DOCUMENT RESUME

ED 206 842                                              CE 029 927

AUTHOR         Tallmadge, G. Kasten; Yuen, Sandra
TITLE          Study of the Career Intern Program. Final Report--Task B:
               Assessment of Intern Outcomes.
INSTITUTION    RMC Research Corp., Mountain View, Calif.
SPONS AGENCY   National Inst. of Education (ED), Washington, D.C.
REPORT NO      RMC-UR-482
PUB DATE       May 81
CONTRACT       400-78-0021
NOTE           180p.; For related documents see CE 029 925-930 and
               [identifier illegible in source].

EDRS PRICE     MF01/PC08 Plus Postage.
DESCRIPTORS    *Achievement Tests; *Career Education; Case Studies;
               Dropout Prevention; *Dropout Programs; *Dropouts;
               *Economically Disadvantaged; Experiential Learning;
               *Field Experience Programs; High School Equivalency
               Programs; High School Students; Nontraditional
               Education; *Outcomes of Education; Potential
               Dropouts; Program Descriptions; *Program
               Effectiveness; Secondary Education; Student
               Attrition; Student Characteristics; Student
               Educational Objectives; Student Evaluation;
               Success
IDENTIFIERS    *Career Intern Program



ABSTRACT 

A study assessed the impact of the Career Intern
Program (CIP) on participating students. (The CIP is an alternative
high school designed to enable disadvantaged and alienated dropouts
or potential dropouts to earn regular high school diplomas, to
prepare them for meaningful employment or postsecondary education,
and to facilitate their transition from school to work by providing
instruction, counseling, hands-on career exposure,
diagnosis/assessment, and climate.) To evaluate student outcomes,
standardized reading and mathematics achievement tests were
administered to both an experimental and a control group on four
occasions (upon entering the program, six and twelve months
thereafter, and six to twelve months after completing the program).
The declining number of students in the test samples (1660 students
tested initially, 786 students tested midway into the program, and
500 tested at its conclusion) reflected the program's high attrition
rates. Despite the high attrition rate (which may be explained, at
least in part, by a number of operational problems involving tight
scheduling, funding, and unrealistic enrollment quotas), achievement
test results support the success of CIP. (Related reports evaluating
other aspects of CIP are available separately through ERIC--see
note.) (HI)



* Reproductions supplied by EDRS are the best that can be made *
*                 from the original document.                  *



RMC Report No. UR 482



STUDY OF THE CAREER INTERN PROGRAM



Final Report Task B:
Assessment of Intern Outcomes



G. Kasten Tallmadge
Sandra D. Yuen



May 1981



Prepared for the
National Institute of Education



RMC Research Corporation
Mountain View, California



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this document
do not necessarily represent official NIE
position or policy.



The research reported herein was performed pursuant to Contract
No. 400-78-0021 with the National Institute of Education, U.S.
Department of Health, Education and Welfare. Contractors undertaking
such projects under Government sponsorship are encouraged
to express freely their professional judgment in the conduct of
the project. Points of view or opinions stated do not, therefore,
necessarily represent official Institute position or policy, and
no official endorsement by the sponsor should be inferred.






Contents

                                                              Page

List of Tables .... iv
List of Figures .... x
PREFACE .... xi
ACKNOWLEDGEMENTS .... xiii
EXECUTIVE SUMMARY .... xv

I.   INTRODUCTION .... 1

II.  METHODOLOGY .... 13
        Instrumentation .... 14
        The Control Group Design .... 14
        The Comparison Group Design .... 22
        The Norm-Referenced Design .... 22

III. RESULTS .... 27
        Reading .... 31
        Mathematics .... 43
        Career Development Inventory .... 55
        Self-Esteem Inventory .... 70
        Internal-External Scale .... 81

IV.  DISCUSSION .... 99

V.   SUMMARY AND DISCUSSION .... 109

APPENDIX A: Comparability of the Evaluation Designs Used in the CIP Study .... 113
APPENDIX B: Selection of the Achievement Test to be Used in the CIP Evaluation Study .... 121
APPENDIX C: Instruments .... 133
APPENDIX D: The Correction for Guessing: Valid and Invalid Applications .... 157

REFERENCES .... 163



List of Tables

                                                              Page

Table 1   Sample Sizes by Site and Cohort at the Time of Each Testing .... 27
Table 2   Attrition Rates by Site and Cohort .... 28
Table 3   Treatment Group Pre-to-Midtest NCE Gains in Reading: Estimates Derived from Norm-Referenced Analyses .... 32
Table 4   Treatment Group Pre-to-Posttest NCE Gains in Reading: Estimates Derived from Norm-Referenced Analyses .... 33
Table 5   Control and Comparison Group Pre-to-Midtest NCE Gains in Reading: Estimates Derived from Norm-Referenced Analyses .... 34
Table 6   Control and Comparison Group Pre-to-Posttest NCE Gains in Reading: Estimates Derived from Norm-Referenced Analyses .... 35
Table 7   Treatment Group NCE Gains in Reading at Midtest Time: Estimates Derived from Covariance Analyses .... 36
Table 8   Treatment Group NCE Gains in Reading at Posttest Time: Estimates Derived from Covariance Analyses .... 37
Table 9   Treatment Group NCE Gains in Reading at Midtest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 39
Table 10  Treatment Group NCE Gains in Reading at Posttest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 40
Table 11  Treatment Group NCE Gains in Reading at Midtest Time: Estimates Derived from Matched Pairs Analyses .... 41
Table 12  Treatment Group NCE Gains in Reading at Posttest Time: Estimates Derived from Matched Pairs Analyses .... 42
Table 13  Treatment Group Pre-to-Midtest NCE Gains in Math: Estimates Derived from Norm-Referenced Analyses .... 43
Table 14  Treatment Group Pre-to-Posttest NCE Gains in Math: Estimates Derived from the Norm-Referenced Analyses .... 44



Table 15  Control and Comparison Group Pre-to-Midtest NCE Gains in Math: Estimates Derived from Norm-Referenced Analyses .... 46
Table 16  Control and Comparison Group Pre-to-Posttest NCE Gains in Math: Estimates Derived from Norm-Referenced Analyses .... 47
Table 17  Treatment Group NCE Gains in Math at Midtest Time: Estimates Derived from Covariance Analyses .... 48
Table 18  Treatment Group NCE Gains in Math at Posttest Time: Estimates Derived from Covariance Analyses .... 49
Table 19  Treatment Group NCE Gains in Math at Midtest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 51
Table 20  Treatment Group NCE Gains in Math at Posttest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 52
Table 21  Treatment Group NCE Gains in Math at Midtest Time: Estimates Derived from Matched Pairs Analyses .... 53
Table 22  Treatment Group NCE Gains in Math at Posttest Time: Estimates Derived from Matched Pairs Analyses .... 54
Table 23  Treatment Group Pre-to-Midtest Raw Score Gains: Career Development Inventory, Second Cohort .... 55
Table 24  Treatment Group Pre-to-Posttest Raw Score Gains: Career Development Inventory, Second Cohort .... 56
Table 25  Treatment Group Raw Score Gains on the CDI Planning Scale at Midtest Time: Estimates Derived from Covariance Analyses .... 57
Table 26  Treatment Group Raw Score Gains on the CDI Planning Scale at Posttest Time: Estimates Derived from Covariance Analyses .... 58
Table 27  Treatment Group Raw Score Gains on the CDI Planning Scale at Midtest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 60
Table 28  Treatment Group Raw Score Gains on the CDI Planning Scale at Posttest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 61



Table 29  Treatment Group Raw Score Gains on the CDI Resources Scale at Midtest Time: Estimates Derived from Covariance Analyses .... 62
Table 30  Treatment Group Raw Score Gains on the CDI Resources Scale at Posttest Time: Estimates Derived from Covariance Analyses .... 63
Table 31  Treatment Group Raw Score Gains on the CDI Resources Scale at Midtest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 64
Table 32  Treatment Group Raw Score Gains on the CDI Resources Scale at Posttest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 65
Table 33  Treatment Group Raw Score Gains on the CDI Information Scale at Midtest Time: Estimates Derived from Covariance Analyses .... 66
Table 34  Treatment Group Raw Score Gains on the CDI Information Scale at Posttest Time: Estimates Derived from Covariance Analyses .... 67
Table 35  Treatment Group Raw Score Gains on the CDI Information Scale at Midtest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 68
Table 36  Treatment Group Raw Score Gains on the CDI Information Scale at Posttest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 69
Table 37  Treatment Group Pre-to-Midtest Raw Score Gains: Self-Esteem Inventory, Second Cohort .... 70
Table 38  Treatment Group Pre-to-Posttest Raw Score Gains: Self-Esteem Inventory, Second Cohort .... 71
Table 39  Treatment Group Raw Score Gains on the Self-Esteem Scale at Midtest Time: Estimates Derived from Covariance Analyses .... 73
Table 40  Treatment Group Raw Score Gains on the Self-Esteem Scale at Posttest Time: Estimates Derived from Covariance Analyses .... 74




Table 41  Treatment Group Raw Score Gains on the Self-Esteem Scale at Midtest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 75
Table 42  Treatment Group Raw Score Gains on the Self-Esteem Scale at Posttest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 76
Table 43  Treatment Group Raw Score Gains on the Openness Scale at Midtest Time: Estimates Derived from Covariance Analyses .... 77
Table 44  Treatment Group Raw Score Gains on the Openness Scale at Posttest Time: Estimates Derived from Covariance Analyses .... 78
Table 45  Treatment Group Raw Score Gains on the Openness Scale at Midtest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 79
Table 46  Treatment Group Raw Score Gains on the Openness Scale at Posttest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 80
Table 47  Treatment Group Pre-to-Midtest Raw Score Gains: Internal-External Scale, Second Cohort .... 81
Table 48  Treatment Group Pre-to-Posttest Raw Score Gains: Internal-External Scale, Second Cohort .... 81
Table 49  Treatment Group Raw Score Gains on the Internal-External Scale at Midtest Time: Estimates Derived from Covariance Analyses .... 83
Table 50  Treatment Group Raw Score Gains on the Internal-External Scale at Posttest Time: Estimates Derived from Covariance Analyses .... 84
Table 51  Treatment Group Raw Score Gains on the Internal-External Scale at Midtest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 85
Table 52  Treatment Group Raw Score Gains on the Internal-External Scale at Posttest Time: Estimates Derived from Standardized Gain Analyses, Third Cohort .... 86



Table 53  Return Rates for the First and Second Follow-Ups by Site, Cohort, and Group .... 88
Table 54  High School Status of Treated and Untreated Group Members: First Follow-Up, Second Cohort .... 89
Table 55  High School Status of Treated and Untreated Group Members: Second Follow-Up, Second Cohort
Table 56  High School Status of Treated, Untreated, and Control Group Members: First Follow-Up, Third Cohort .... 91
Table 57  High School Status of Treated, Untreated, and Control Group Members: Second Follow-Up, Third Cohort .... 92
Table 58  High School Status of Treated, Untreated, and Control Group Members: First Follow-Up, Fourth Cohort .... 93
Table 59  School/Employment Status of Treated and Untreated Group Members: First Follow-Up, Second Cohort .... 94
Table 60  School/Employment Status of Treated and Untreated Group Members: Second Follow-Up, Second Cohort .... 95
Table 61  School/Employment Status of Treated, Untreated, and Control Group Members: First Follow-Up, Third Cohort .... 96
Table 62  School/Employment Status of Treated, Untreated, and Control Group Members: Second Follow-Up, Third Cohort .... 97
Table 63  School/Employment Status of Treated, Untreated, and Control Group Members: First Follow-Up, Fourth Cohort .... 98
Table 64  MAT '78 Advanced Level 1, Form JS, Reading Comprehension Test Items Grouped by Instructional Objective and by Passage .... 130
Table 65  CAT '77 Level 18, Form C, Reading Comprehension Test Items Grouped by Instructional Objective and by Passage .... 130
Table 66  Number and Percentage of Items Under Each Objective .... 130



Table 67  MAT '78--Advanced Level 1, Form JS, Mathematics Item Number and Number of Items Under Each Objective .... 131
Table 68  CAT '77--Level 18, Form C, Mathematics Computations and Mathematics Concepts and Applications Item Number and Number of Items Under Each Objective .... 131



List of Figures

                                                              Page

Figure 1.  Summary of content and other characteristics of the California Achievement Test (1970) .... 123
Figure 2.  Summary of content and other characteristics of the California Achievement Test (1977) .... 124
Figure 3.  Summary of content and other characteristics of Comprehensive Tests of Basic Skills (1973) .... 125
Figure 4.  Summary of content and other characteristics of Metropolitan Achievement Test (1978) .... 126
Figure 5.  Summary of content and other characteristics of Sequential Tests of Educational Progress (1969) .... 127



PREFACE 



This report is concerned with the impact that the Career Intern
Program has had on participating students. It is a traditional
outcome evaluation, heavily quantitative in its orientation.
Unfortunately, it serves well to illustrate the limitations that
traditional experimental approaches have when applied to social
reform programs in field settings. The various designs that were
employed had to be adapted to the practicalities of real-world
conditions, experimental controls were inadequate, and attrition
from all groups studied was high. In the end, important assumptions
underlying statistical tests were badly violated, and serious
concerns arose as to the internal validity of all of the analyses that
were undertaken.

In putting the report together, we have tried to point out the
many flaws that exist. At the same time we have attempted to
salvage what is useful and to piece together the various bits of
evidence that have been assembled in as meaningful a way as
possible. In doing so, we have tried to tie observed outcomes to
significant implementation events that took place at each of the
four program sites. Some of the inferences we have drawn are quite
speculative; others are more defensible. Throughout our efforts,
however, we were frustrated by the inadequacy of the tools we had
to use.

Our frustration was not unexpected. We had seen the evaluation
of the CIP prototype and were aware that we would encounter even
greater problems. We were also aware, as Cronbach, Ambron,
Dornbusch, Hess, Hornik, Phillips, Walker, and Weiner (1980) have noted,
that "Few evaluative experiments to date have achieved all the
following earmarks of internal validity: genuinely randomized
assignment; meaningful, describable treatments; samples large enough
to give reasonable statistical power; and attrition low enough to
maintain the initial equivalence" (p. 308).

The fact that the problems we encountered in this study were
not unique failed to make us feel much better because the report
does not adequately reflect what we believe we know about the
program. For approximately three years, members of the RMC project
staff have spent considerable time on site, have had lengthy
conversations with staffs and students, and have observed all aspects of
program operations. Based on these experiences we believe that the
CIP, when properly implemented, is a powerful force for reshaping
the lives of disadvantaged and alienated youths. We believe that
program participants realize cognitive achievement benefits and
develop useful career awareness. We believe that more of them
graduate from high school, go on to further education, and/or obtain
meaningful employment than would be the case without the CIP. The
evidence contained in this report, however, while supportive of
these beliefs, is not entirely conclusive.



In 1979, Donald Campbell took the position that, where
qualitative data collected through interviews and observations "are
contrary to the quantitative results, the quantitative results
should be regarded as suspect" (p. 53). In the case of the present
study, the quantitative and qualitative data are in general
agreement. The problem lies solely in the fact that the quantitative
data are vulnerable to attacks regarding their internal validity.
Following Campbell's lead, we now take the position that the
credibility of the quantitative findings is substantially enhanced by
the fact that the qualitative data also support program success.

Presentation of all the qualitative data that support program
success is beyond the scope of this report. It is thoroughly
documented in a companion volume (Fetterman, 1981), however, to
which the interested reader is referred.

In this report, we have advanced several hypotheses that may
appear to be inadequately supported by the available data. In most
instances, the cited Fetterman report contains additional relevant
information. Even so, some of our inferences may go beyond the
data. We were guided by the following statement:

     Social scientists are trained to suppress
     relationships that do not reach statistical
     significance. However, no relation that makes
     sense ought to be discarded. We say this
     despite the truism that an explanation can be
     dreamed up to fit any adventitious result.
     (Cronbach et al., 1980, p. 315)

We hope and believe that we have not "dreamed up" explanations
to fit the data. At the same time, we are aware that the "hard"
data do not, in and of themselves, provide conclusive proof that the
Career Intern Program was successful in achieving its objectives.
It is only when one considers the qualitative data as well that the
argument seems to us to be overwhelmingly convincing.



ACKNOWLEDGEMENTS



The authors are indebted to many individuals for help related
to this report. Germin Calder, Susie Guiora, Thomas Hyde, Patrick
Lennahan, Roberta Staples, Gail Sydnor, and Robert Vaughan bore
primary responsibility for on-site data collection. Linda Terhune,
Fred Weiner, Mary Pat Gaspich, Susan Gaspich, and Kathleen Gaspich
scored tests, coded data, and assisted with the statistical
analyses. We are very grateful for all of these contributions.

The CIP directors and our colleagues at Opportunities
Industrialization Centers of America, particularly Robert Jackson,
also deserve many thanks for their cooperation, assistance, and
understanding in conducting this evaluation. We are also grateful
to Charles Stalford, Howard Lesnick, and Daniel Antonoplos of the
National Institute of Education for their concern and guidance and
for the encouragement they provided.

Finally, we wish to acknowledge the very helpful comments and
suggestions of Robert Boruch and Andrew Porter who reviewed an
earlier Task B report. This document is much improved as a result
of the comments and suggestions they provided.

GKT 
SDY 






EXECUTIVE SUMMARY 



Background of the Career Intern Program 

The Career Intern Program (CIP) is an alternative high school
designed to serve disadvantaged and alienated students (called
interns) who either dropped out of regular high schools or who were
considered potential dropouts. The objectives of the program are to
enable students to earn a regular high school diploma (as opposed to
a GED), to prepare them for meaningful employment, and to facilitate
their transition from school to work. The program offers extensive
counseling--academic, personal, and career--and attempts to make
academic subjects palatable and relevant to the lives of the
students through a heavy infusion of career-oriented content.

Run by a community-based organization, the Career Intern
Program enjoys an unusual symbiotic working relationship with the local
school district. It serves those students whose needs are not
adequately met by the local high school, but the students remain on
the local school's books. State monies that are distributed to the
schools based on enrollment or attendance thus continue to flow to
the local high school even though the students are being served by
the CIP. The high schools award diplomas to students graduated by
the CIP.

The CIP was initially developed in Philadelphia in the
mid-1970s. An independent evaluation conducted by Richard A. Gibboney
Associates (Gibboney Associates, 1977) found the program to be
successful. The evidence of success was judged sound by the Joint
(U.S. Office of Education and National Institute of Education)
Dissemination Review Panel, and the program was approved by that
group as eligible for federally funded dissemination.

Under authorization of the Youth Employment and Demonstration
Projects Act (YEDPA, Public Law 95-93), the Department of Labor
(DOL) and the National Institute of Education (NIE) entered into an
Interagency Agreement in late 1977 to test the replicability of the
CIP and to determine whether the same beneficial outcomes could be
obtained in the replication sites. Subsequently, NIE contracted
with the Opportunities Industrialization Centers of America (OIC/A)
to manage the replication effort. OIC/A then, through a competitive
bidding process, selected four local OIC chapters to undertake the
CIP replication. Three of the selected sites were urban and one was
located in a small (30,000) city.

Overview of the Evaluation

The work statement for the evaluation was prepared jointly by
NIE and DOL. Four separate tasks were called for:

• Task A. Conduct studies and analyses as required to answer
  the questions, "What happens to the Career Intern Program in
  the process of implementation in additional sites? What
  factors account for the changes or adaptations, if any? For
  the fidelity, if any, to the original program goals and
  practices?" (RFP NIE-R-78-0004, p. 9)

• Task B. Conduct studies and analyses as required to answer
  the question, "Does the Career Intern Program continue to be
  effective in helping youth when it is implemented in sites
  other than the Philadelphia prototype?" (ibid., p. 13)

• Task C. Conduct studies and analyses as required to answer
  the question, "What happens to young people in the CIP
  program that could account for its effectiveness?" (ibid.,
  p. 16)

• Task D. Conduct studies and analyses as required to answer
  the fourth question, "How does the CIP approach compare in
  effectiveness, feasibility, impact, and factors important for
  policy with other approaches undergoing comparable
  evaluations, to helping the population to be served through the
  Youth Employment Act?" (ibid., p. 20)

To assure comparability with the original CIP evaluation, the
work statement specified that the evaluations of the replication
sites employ the same instruments and designs as that study. While
some modifications were eventually made to strengthen the study,
care was taken to preserve the desired comparability.

The present report deals only with Task B. Task A and Task C,
however, are highly relevant to the material presented herein as
variations in the extent or manner in which individual components of
the treatment were implemented almost certainly affected program
outcomes. While an attempt has been made throughout this report to
relate observed outcomes to implementation events and conditions,
much more detailed information is provided in the reports of the
other two tasks (Treadway, Stromquist, Fetterman, Foat, & Tallmadge,
1981; Fetterman, 1981).

Methodology

Social science research in the field cannot be implemented in
strict accordance with the "rules" that govern laboratory studies.
The primary problem for the present evaluation was very high
attrition rates in both treatment and control groups. These high
attrition rates rendered it impossible to determine with complete
certainty whether observed differences between groups at posttest time
resulted from the treatment or from some other influence (including
attrition itself). This and other problems led the investigators to
employ a variety of different evaluation approaches and data
analysis strategies. By examining the data from several different
perspectives, it was reasoned, a more credible case could be made
for the success or failure of the program in achieving its goals.

It is beyond the scope of this summary to describe each of the
various techniques that were employed. Such descriptions are, of
course, contained in the main body of the report. It should be
noted here, however, that the different approaches yielded somewhat
different results. Furthermore, since some of the assumptions
underlying each approach were violated, it is not clear which
"answer" (if any) should be believed. Lest too negative a picture
be presented, however, we hasten to point out that the differences
among results were not extreme and all tended to support the success
of the CIP.
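
The attrition problem described above can be made concrete with a
little arithmetic. The following is only an illustrative sketch in
Python: it uses the pooled numbers of students tested that are quoted
in the ERIC abstract for this report (1660 initially, 786 midway, 500
at the conclusion), and the wave labels and the helper function name
are assumptions introduced here, not terms from the study. Because the
counts pool treatment and control students, the sketch cannot show
attrition for either group separately.

```python
# Pooled counts of students tested, as quoted in the ERIC abstract.
# The wave labels are illustrative names, not terms from the report.
tested = {"pretest": 1660, "midtest": 786, "posttest": 500}


def attrition_rate(initial: int, remaining: int) -> float:
    """Return the fraction of the initial sample that was not retested."""
    if initial <= 0:
        raise ValueError("initial sample size must be positive")
    if not 0 <= remaining <= initial:
        raise ValueError("remaining must lie between 0 and initial")
    return 1 - remaining / initial


baseline = tested["pretest"]

# Attrition at each testing wave, relative to the number tested initially.
rates = {wave: attrition_rate(baseline, n) for wave, n in tested.items()}

for wave, rate in rates.items():
    print(f"{wave}: {rate:.1%} attrition relative to pretest")
```

With these counts, the sketch reports attrition of roughly 53 percent
by the midtest and 70 percent by the posttest, relative to the number
tested initially.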

Implementation Events

When the CIP is well implemented, there is reason to expect
that it will impact positively on participating students. When it
is not well implemented, less sanguine expectations seem
appropriate. It is important to make this point because each of the
four CIP demonstrations experienced serious implementation
difficulties at various times. Only meager evidence of success could
reasonably be expected during these times.

One of the sites got off to a good start but then encountered
serious difficulties that were never adequately resolved during the
entire demonstration period. Another site that ran for many months
in a truly exemplary manner fell into disarray when its director
departed. Two other sites experienced severe start-up problems.
One was well on the way to recovery when its director and several
other key staff left. The other did achieve a high degree of
implementation success--but not until the end of the demonstration
period was imminent. Not one of the three cohorts of students
studied at any of the four sites experienced a full year of program
treatment unmarred by some sort of major trauma.

The fact that regularly attending students were often not
receiving a "full" treatment was compounded by the irregular
attendance of many others. In addition, some students were so poorly
prepared academically that they simply could not cope with the
curriculum and should never have been admitted to the program. Both
of these problems were direct outgrowths of the extreme pressures
applied to the sites to meet enrollment quotas.

Taken together, the various influences described above acted in
a manner that could only detract from the measured impact of the
CIPs. Still, when the programs were operating well there was ample
evidence of success. Even when all was not well, some gains
continued to be observed.



Results



Evidence was found that the CIPs had significant "holding
power" over participating students. This holding power,
furthermore, varied in direct proportion to the quality of program
implementation. When all program components were in place and
functioning smoothly, attendance was high and attrition was low. When the
programs encountered implementation problems, attendance fell off
and attrition increased.

In the area of reading achievement, results over the 12-month
period between pre- and posttests showed statistically significant
gains when data were pooled across sites and cohorts. When the
performance of CIP students was compared against expectations
derived from normative data, however, the gain estimate was more
than two-and-a-half times as large as that derived from the
treatment-control comparison. While it is believed that the larger
estimate is the more accurate one, some would argue that the smaller
estimate was more credible. When the performance of CIP students
was compared against the performance of students in other
alternative programs, statistically and educationally significant
advantages were found for the CIP.

Most of the individual-site and individual-cohort gain
estimates were statistically significant in the norm-referenced
analyses. In the treatment-control analyses, only the across-site,
across-cohort estimate attained significance.
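
The tables in this report express reading and mathematics gains in
Normal Curve Equivalents (NCEs). As a hedged illustration of that
scale, and not a reproduction of the analyses actually run, the Python
sketch below converts a national percentile rank to an NCE with the
standard formula NCE = 50 + 21.06z, where z is the normal deviate for
the percentile. The function names and the example percentiles are
hypothetical. A norm-referenced gain is read here under the common
assumption that an untreated group would hold its percentile standing,
so that its expected NCE gain would be zero.

```python
from statistics import NormalDist

# Standard scaling constant that makes NCEs of 1, 50, and 99 line up
# with percentile ranks of 1, 50, and 99.
NCE_SCALE = 21.06


def percentile_to_nce(percentile: float) -> float:
    """Convert a national percentile rank (0 < p < 100) to an NCE."""
    if not 0 < percentile < 100:
        raise ValueError("percentile must be strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile / 100)  # normal deviate
    return 50 + NCE_SCALE * z


def nce_gain(pre_percentile: float, post_percentile: float) -> float:
    """NCE gain relative to the no-treatment expectation of zero change."""
    return percentile_to_nce(post_percentile) - percentile_to_nce(pre_percentile)


# Hypothetical example: a group moving from the 20th to the 30th
# percentile. These values are not taken from the report.
example_gain = nce_gain(20, 30)
print(f"Illustrative NCE gain: {example_gain:.1f}")
```

Because NCEs form an equal-interval scale, gains computed this way can
be averaged across students, which percentile ranks do not allow.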

In math, the picture was similar, but the gains were somewhat
smaller. This finding was not surprising as all of the sites
experienced great difficulty in attracting and retaining qualified
math instructors. None of the pre-to-posttest gain estimates
derived from treatment-control comparisons was statistically
significant when "normal" analytic procedures (analyses of covariance)
were used. Under an alternative approach (standardized gain
analyses), a somewhat more positive picture emerged. In the
norm-referenced analyses, statistically significant gains were found for
all three of the cohorts studied when the data were pooled across
sites.

Of the 12 individual-site, individual-cohort analyses, 5 showed
statistically significant norm-referenced gains. Perhaps the most
notable result of the math analyses was the fact that the gains were
consistently positive at times when individual sites were known to
have had appropriately qualified math teachers and consistently
negative when they did not.

When the performance of CIP students was compared with that of
students in other alternative high schools, the results strongly
favored the CIP group at two individual sites and in the across-site
analysis. The same results were obtained when CIP students were
compared against a group of regular high school students.



xviii 



statistically significant gains were found on all three scales 
of the Career Development Inventory (Planning, .Use of Resources, and 
Information) in several of the individual'-site analyses. Across 
sites, the gain estimates were significant in over half of the 
cases. Gains on the Information scale, although statistically sig-- 
nificant, were small. This finding was surprising in view of the 
heavy infusion of career-related material in the CIP curriculum. 
Examination of the scale's content, however, revealed that it was 
pitched at a global and theoretical level while the CIP's instruc- 
tion was at a more job-specific, practical level. 

Statistically significant gains in self-esteem were observed in
half of the across-site analyses at posttest time. Interestingly,
however, none of the corresponding analyses showed a significant
treatment effect at midtest time. Other variables also showed
smaller pre-to-midtest than pre-to-posttest effects, but in no
other case was the difference so pronounced. It was concluded that
changes in self-concept require extended exposure to the type of
counseling and other program features offered by the CIP.

While it seemed logical to expect that CIP participants would
experience an increased sense of control over their lives, scores on
the Internal-External (locus of control) scale reflected significant
gains in only a few scattered instances. Again, this inconsistency
between impressions gained through extended on-site observations and
the quantitative data was attributed to deficiencies in the
instrument rather than failure of the treatment.

CIP participants and members of the control groups were followed
up in the summer of 1980 and again in January and February of
1981. Analyses of the data obtained from these follow-ups are more 
directly related to the CIP's stated goals of helping participants 
earn their high school diplomas and enhancing their employability 
than those involving test scores. Gains on achievement, informa- 
tion, and self-concept tests may well be important, but they are at 
best intermediate goals of the program. 

Comparisons between treatment and control groups in terms of
the numbers who had graduated from high school, were currently 
enrolled, or had earned a GED were generally favorable. 

For the fourth cohort, the high school status of the treatment
group was significantly better than that of the control group at one 
individual site and across all four sites. This was despite the 
fact that serious implementation problems existed at one site. At 
that site, the status of the control group was better than that of 
the treatment group (although not significantly so).

The third-cohort data also showed a significant advantage for 
the treatment group over the control group at one individual site. 
The negative results of the site experiencing implementation
difficulties, however, prevented the differences from being
significant across all four sites. When data were combined across the
three sites that were not having implementation problems, a sig- 
nificant advantage was again found for the treatment group. 

The second cohort had no control group. Across sites, however,
a larger percentage of treatment group members had graduated from
high school, were currently enrolled, or had earned a GED than was
the case with either the third or fourth cohorts. This relationship
held at both the first and second follow-ups largely because the
operational problems at one site that are referred to above had not
yet developed.

The second stated goal of the CIP to which follow-up data were 
relevant was that of smoothing the transition from school to work. 
Because large numbers of students were still enrolled in school, 
however, it seemed most appropriate to compare treatment and control
groups in terms of the numbers either in school or employed versus
those not in school and not employed. 

The results of these comparisons were slightly less favorable
than those related to high school status, but still positive. The
fourth-cohort treatment group presented a better picture than the
control group at one individual site and across sites on the only
follow-up that was conducted on that cohort. There were no
significant differences between treatment and control groups for the
third cohort, but members of the second and third cohorts who had
participated in the program for at least three months were
significantly better off than those assigned to the treatment group who
either failed to enroll or who dropped out in the first three 
months. 

The authors expect that a more positive picture would emerge if 
information were available regarding the quality of jobs that were
held. While queries were made regarding salary levels and
probabilities for advancement, too few credible responses were received
to show statistically reliable differences between groups. 

Conclusions 

There is substantial quantitative evidence supporting the
success of the Career Intern Program. Considering the number and
severity of operational problems the sites encountered, the data are 
surprisingly good. It is especially noteworthy, however, that when 
programs were operating smoothly, the results were substantially 
more positive than when they were experiencing difficulties. The
potential benefits to program participants thus appear to be
substantially greater than those actually accrued during the
demonstration period.



The authors believe that the nature of the demonstration, with
its extremely tight schedule, unrealistic enrollment quotas,
intrusive evaluation, uncertain funding, and other generally negative
influences, was responsible for at least some of the difficulties
sites encountered. The evidence from at least two of the sites 
suggests that full and smooth implementation is not an unrealistic 
expectation, however, given adequate leadership and time for the 
program to mature. Had all four sites attained this operational 
status, the results of this evaluation would almost certainly have 
been substantially more positive. 

In conclusion, it is appropriate to reiterate that this report 
covers only one aspect of RMC's evaluation of the Career Intern 
Program. The reports of other tasks must also be read in order to 
obtain a complete perspective on the CIP demonstration. Those
reports contain substantial amounts of qualitative data, including
several case studies that should be considered in evaluating the
program. As is pointed out several times in the main body of this
report, these qualitative data lend strong support to the
quantitative evidence. Both sources attest to the effectiveness of the
Career Intern Program in reshaping the lives of disadvantaged and 
alienated youths. 






I. INTRODUCTION



Background of the Career Intern Program 

The Career Intern Program (CIP) is an alternative high school 
designed to serve disadvantaged and alienated students (called
interns) who either dropped out of regular high schools or who were
considered potential dropouts. The objectives of the program are to
enable students to earn a regular high school diploma (as opposed to
a GED), to prepare them for meaningful employment, and to facilitate
their transition from school to work. The program offers extensive
counseling (academic, personal, and career) and attempts to make
academic subjects palatable and relevant to the lives of the
students through a heavy infusion of career-oriented content.

Run by a community-based organization, the Career Intern
Program enjoys an unusual symbiotic working relationship with the local
school district. It serves those students whose needs are not
adequately met by the local high school, but the students remain on
the local school's books. State monies that are distributed to the
schools based on enrollment or attendance thus continue to flow to
the local high school even though the students are being served by
the CIP. The high schools award diplomas to students graduated by
the CIP.

The CIP was initially developed in Philadelphia in the mid- 
1970s. An independent evaluation conducted by Richard A. Gibboney
Associates (Gibboney Associates, 1977) found the program to be
successful. The evidence of success was judged sound by the Joint
(U.S. Office of Education and National Institute of Education)
Dissemination Review Panel, and the program was approved by that 
group as eligible for federally funded dissemination. 

Under authorization of the Youth Employment and Demonstration
Projects Act (YEDPA, Public Law 95-93), the Department of Labor
(DOL) and the National Institute of Education (NIE) entered into an
Interagency Agreement in late 1977 to test the replicability of the
CIP and to determine whether the same beneficial outcomes could be
obtained in the replication sites. Subsequently, NIE contracted
with OIC/A to manage the replication effort. OIC/A then, through a
competitive bidding process, selected four local OIC chapters to
undertake the CIP replication. Three of the selected sites were
urban and one was located in a small (30,000) city.

Overview of the Evaluation 

The work statement for the evaluation was prepared jointly by 
NIE and DOL. Four separate tasks were called for:






• Task A. Conduct studies and analyses as required to answer
the questions, "What happens to the Career Intern Program in
the process of implementation in additional sites? What
factors account for the changes or adaptations, if any? For
the fidelity, if any, to the original program goals and
practices?" (RFP NIE-R-78-0004, p. 9)

• Task B. Conduct studies and analyses as required to answer
the question, "Does the Career Intern Program continue to be
effective in helping youth when it is implemented in sites
other than the Philadelphia prototype?" (ibid, p. 13)

• Task C. Conduct studies and analyses as required to answer
the question, "What happens to young people in the CIP
program that could account for its effectiveness?" (ibid,
p. 16)

• Task D. Conduct studies and analyses as required to answer
the fourth question, "How does the CIP approach compare in
effectiveness, feasibility, impact, and factors important for
policy with other approaches undergoing comparable
evaluations, to helping the population to be served through the
Youth Employment Act?" (ibid, p. 20)

To assure comparability with the original CIP evaluation, the
work statement specified that the evaluations of the replication
sites employ the same instruments and designs as that study. While
some modifications were eventually made to strengthen the study,
care was taken to preserve the desired comparability.

The present report deals only with Task B. Task A and Task C, 
however, are highly relevant to the material presented herein as
variations in the extent or manner in which individual components of
the treatment were implemented almost certainly affected program
outcomes. While an attempt has been made throughout this report to
relate observed outcomes to implementation events and conditions,
small sample sizes (for any particular cohort at any particular
site) and other methodological problems place substantial
limitations on the extent to which clear-cut relationships can be credibly
established. The complexity of implementation events and conditions
is another factor which limits the interpretability of outcome
findings, and the reader is encouraged to examine the Final Task A
Report (Treadway, Stromquist, Fetterman, Foat, & Tallmadge, 1981)
and the Final Task C Report (Fetterman, 1981) to gain a fuller
appreciation of implementation-outcome relationships.

The CIP replication was originally planned as a two-year
demonstration although the possibility of an extension was made known
from the outset. It had been anticipated that four cohorts of
interns would be enrolled at each of the sites during the original
demonstration period. The average size of each cohort was planned
to be 75 and at least 2 of the cohorts were to be over-subscribed
so that randomly assigned control groups could be formed.
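
The over-subscription design described above can be illustrated
with a short sketch. It is not taken from the evaluation itself: the
function name, the applicant identifiers, the pool size of 150, the
control-group size of 50, and the random seed are all hypothetical.
Only the planned cohort size of 75 comes from the text.

```python
import random


def assign_groups(applicants, n_treatment, n_control, seed=None):
    """Randomly split an over-subscribed applicant pool into two groups.

    Hypothetical helper: the report does not describe the mechanics
    of the random assignment that was actually used.
    """
    if n_treatment + n_control > len(applicants):
        raise ValueError("applicant pool is too small to fill both groups")
    rng = random.Random(seed)      # seeded so the draw can be reproduced
    shuffled = list(applicants)    # copy; leave the caller's list untouched
    rng.shuffle(shuffled)
    treatment = shuffled[:n_treatment]
    control = shuffled[n_treatment:n_treatment + n_control]
    return treatment, control


# Hypothetical pool of 150 applicants for one cohort at one site.
applicants = [f"applicant-{i:03d}" for i in range(1, 151)]
treatment, control = assign_groups(applicants, n_treatment=75,
                                   n_control=50, seed=1978)
```

Because every applicant has the same chance of landing in either
group, differences observed later between the groups can be
attributed to the program rather than to who chose to enroll. The
sketch also makes the recruiting constraint explicit: no control
group can be drawn unless applicants outnumber the treatment slots.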




In actuality, only three cohorts were enrolled during the
original demonstration period at each of the four sites because of
severe recruiting difficulties, and the first two of them were
smaller than the 75-member projection. Recruiting difficulties also
precluded the formation of control groups for the first and second
cohorts at all sites, although control groups were established for
the third cohorts at all sites.

A nine-month extension was granted to the four replication
sites. During the extension period, fourth cohorts (complete with
control groups) were taken in.

The evaluation described herein encompasses the second, third,
and fourth cohorts. The first cohort was not included for several
reasons:

• the cohort had entered the program at two sites before the
evaluation contract was awarded,

• it was felt that the replications needed some time to
stabilize and that data collected from the first cohorts would not
provide reliable indices of program effects,

• the first cohorts at several of the sites were quite small
and it was felt that findings based on such small samples
would have been difficult to interpret.

Participating interns and controls were pretested prior to
enrollment, midtested sometime between 3 and 6 months after intake
(depending on the cohort), and posttested sometime between 9 and 12
months after intake. The test battery consisted of paper-and-pencil
tests encompassing reading and math achievement, career awareness,
self-concept, and locus of control. Second- and third-cohort
interns and third-cohort controls were followed up in the summer of
1980 and again in January/February, 1981. Fourth-cohort interns and
controls were followed up only once in January/February, 1981.

Early during the period of recruitment for the third cohort, it
appeared that it might not be possible to assemble enough applicants
to the program to form both treatment and control groups. For
this reason, comparison groups which consisted of (a) low achieving
students in the feeder schools, (b) students enrolled in other
alternative-school programs, and (c) youths who had dropped out of
school were put together to provide alternative baselines against
which to measure the success of CIP interns. These groups were
mid- and posttested at the same time as the third-cohort treatment
and control groups.

Pre-, mid-, and posttest data summaries for all treatment, 
control, and comparison groups are presented in this report (some
were also included in earlier Task B reports). These data were
analyzed three different ways, making use of analyses of covariance,
standardized gains, and norm-referenced approaches. The follow-up
data were analyzed using (primarily) chi-square techniques.
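
For readers unfamiliar with the chi-square technique named above,
the following is a minimal sketch of a Pearson chi-square test on a
2 x 2 follow-up table (group by status). The counts are invented for
illustration and are not data from this evaluation; the function name
is hypothetical, and no continuity correction is applied, which may
differ from the procedure the evaluators actually used.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2 x 2 table of counts.

    `table` is [[a, b], [c, d]]: rows are groups (treatment, control)
    and columns are outcomes (for example, graduated/enrolled/GED
    versus none of these). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if group and outcome
            # were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Critical value of chi-square with 1 degree of freedom, alpha = .05.
CRITICAL_VALUE_DF1_05 = 3.841

# Hypothetical counts, NOT taken from the report.
hypothetical_counts = [[60, 30],   # treatment: positive, not positive
                       [25, 30]]   # control:   positive, not positive
statistic = chi_square_2x2(hypothetical_counts)
is_significant = statistic > CRITICAL_VALUE_DF1_05
```

A statistic larger than the critical value indicates that the
proportions with a favorable status differ between the two groups by
more than chance alone would readily produce. With small cell counts,
as in some individual-site comparisons, the approximation becomes
less trustworthy.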

Summary of Relevant Implementation Events and Conditions

The most important consideration to keep in mind when reviewing
the outcome evaluation findings presented in this report is that the
CIP encountered a large number of implementation problems. Many of
these problems stemmed either directly or indirectly from the
extremely compressed time schedule and bad timing associated with
start-up operations. (Contracts were awarded to the replication
sites in mid-December, 1977. Staffing and training were
accomplished during the remainder of that month and the sites were
expected to begin serving students by the end of January, 1978.) A
second major source of problems arose from the anxieties felt by
both staff and students as the demonstration period drew to an end
and futures were uncertain.

These causes underlying implementation difficulties are
important because they were functions of the manner in which the
demonstration was undertaken and do not necessarily reflect
negatively on the transportability of the CIP. Despite the reasons for
their existence, however, there can be no doubt that implementation
difficulties impacted on the "treatment" that the CIP interns
received. In fact, none of the interns in any of the three cohorts
studied at any of the four sites experienced twelve months of
treatment that was not disrupted by at least one major trauma such as the
termination or resignation of the director.

Brief summaries of significant implementation events at each 
site follow. Subsequent sections of the report refer back to these
summaries whenever they appear to be useful in understanding or 
explaining outcome findings. 

Site A. Site A got off to a good start. The director had
previous experience in setting up new organizations and proved to be a
capable leader in start-up operations. Unfortunately, other key
positions were occupied by less well suited individuals.
Nevertheless, Site A enrolled its first cohort on March 20th, 1978, and
achieved full operational status shortly thereafter.

The second cohort of interns (the first one studied) entered 
the program on July 24th, 1978. For the following six months (until 
midtesting in late January-early February, 1979), the program
operated relatively smoothly. One problem, however, was that staff
turnover was high: 11 of 22 staff members left the program, either
voluntarily or involuntarily. This high staff turnover served to
lower intern attendance rates, yet morale was high among both staff
and attending students. There were, however, some significant
staffing problems that remained to be solved, particularly in the
counseling department.




The third cohort of interns entered the program at the
beginning of February, and almost immediately thereafter things began
to come apart. The pressures to meet third-cohort enrollment quotas
for both treatment and control groups had been intense and had led
to interpersonal animosities and a general lowering of morale.
Several staff members voiced dissatisfaction with the director's
management style which they perceived as authoritarian and
unprofessional.

At about this time a staff committee imposed a Code of Conduct 
and a Dress Code on the interns. The sudden and apparently
arbitrary manner in which these new regulations were imposed produced a
strong negative reaction on the part of the interns who went so far
as to stage a temporary boycott of the program.

More serious problems arose when RMC's first report on
implementation was published in March. The negative comments about Site
A were carefully culled from that report and related to the CIP
staff by the local OIC director without any indication that the
report also had many positive things to say. Morale plummeted,
dissension rose sharply, and productive program functions ground
nearly to a halt. In May the CIP director was forced to resign. A
new director was brought in, but recovery was slow.

In late May-early June, the third-cohort interns were
midtested. About a month later, second-cohort interns were posttested.

The new director lasted only a few months and was terminated in
September. During his tenure, however, there were two other
resignations in key management roles.

The counseling supervisor was appointed director in September,
1979. His lack of management experience soon became apparent,
however, and the staff reported serious difficulties in
communication. Morale did not improve and, in fact, divisiveness among staff
members increased. At about this same time, the end of the
originally planned demonstration was drawing near. The future of the
program was unclear, although there were vague promises of an
extension. Staff members began to worry about their future
employment and this concern was one more factor that negatively affected
program operations and climate.

A nine-month extension was finally granted in December, 1979,
and all four sites began feverish recruiting efforts to meet
enrollment quotas (90 treatment, 55 control students) by the January
31st, 1980, deadline.

In late January-early February, third-cohort interns were post- 
tested. At about the same time, fourth-cohort interns entered the 
program. The increased size of the student body improved the
climate at Site A temporarily but the widening rift between the
director and key staff persons quickly served to offset this gain.



Midtesting of the fourth cohort took place in late May-early
June, at the end of the regular school year. As the summer
progressed, it became clear that full program funding would not be
extended beyond September. Both staff and interns were increasingly
concerned about their futures. By the end of August, when the
fourth cohort was posttested, the program was in disarray. Intern
attendance was well below 50%, three of the top four managers
(including the third director) had resigned, and most remaining
staff members were new and untrained. Program operations were at a
virtual standstill and staff and intern morale were at an all-time
low.

Site B. Site B also got off to a good start. Although major
problems were experienced in working out an adequate agreement with
the LEA, operations ran smoothly once that hurdle had been cleared.
The director of the program was well qualified and had strong
leadership skills. At least partly as a result of his efforts,
capable and caring individuals were found for both counseling and
instructional staff positions. Intern recruitment was less of a
problem at Site B than at the other sites and consequently less
disruptive of other program functions. The facility, although
smaller than would have been desirable, was bright and pleasant and
contributed to the overall positive climate of the program.

Site B enrolled its first cohort of interns on April 17th, 
1978, before the approval of the LEA had been obtained. The second 
cohort was enrolled in mid-October. Staff turnover from program 
start-up until enrollment of the second cohort was limited to two 
professionals, both of whom had left to take better paying jobs.
During the next three months, two math teachers and an aide left the
program, also to accept higher paying positions. This pattern of
terminations confirmed the fact that the CIP salary scale was not
competitive. More importantly for the present discussion, the fact
that there were no dismissals and only a few voluntary terminations
suggests that hiring practices were unusually effective at Site B
and that there was little job dissatisfaction.

As was the case at all other sites, a third cohort of interns
was enrolled about the end of January, 1979. The enrollment quotas
of 90 interns and 55 controls were met without great difficulty
although the entire staff and several interns had to be pressed into 
recruiting duty. 

The large number of interns enrolled at Site B exceeded the 
housing capacity of the facility and additional space had to be
leased in a nearby building. Walking between buildings provided a
temptation to "cut out" that some interns found impossible to
resist. Attendance fell and the climate at the site suffered
somewhat. The counseling staff reported that the large number of
interns to be served precluded them from spending as much time with
each individual as would have been desirable. Despite these
difficulties, the program continued to run smoothly and morale was high
among both staff and interns. 




Midtesting of second-cohort interns took place in mid-April,
1979, and third-cohort interns were midtested in late May-early
June. The program was still operating smoothly but staff morale was
beginning to be affected by the low salaries, heavy work load, and,
perhaps most importantly, by the lack of vacations comparable to
those of teachers in the regular schools.

The second cohort was posttested in mid-October. At that time,
and in the course of a mid-November site visit, symptoms of staff
burnout were beginning to emerge. This problem was exacerbated by
uncertainties regarding the extension of funding beyond the original
demonstration period. Intern attendance continued to be somewhat
lower than it was before intake of the third cohort, but all aspects
of the program continued to be implemented and the climate was
generally positive.

In mid-December, the nine-month extension became official.
Site B was well prepared and enrolled a fourth cohort in January (a
55-member control group was also formed). Short-term anxieties
about the program's future were relieved. Posttesting of the
third-cohort interns was accomplished at about this same time.

CIP operations continued much as before until April when the
director announced his intention to resign for reasons of career
advancement. His resignation had a major impact on all aspects of
CIP operations. The deputy OIC executive director was given
responsibility for the program when the original director departed. About
a month later another OIC person was assigned half time as interim
acting CIP director. Unfortunately, the staff perceived these two
individuals as temporary employees and behaved accordingly. The
lack of leadership took its toll. Intern attendance fell and a
number of interns dropped out of the program altogether. Several
staff members also chose this time to move on.

Midtesting of fourth-cohort interns was accomplished in late
May-early June, 1980. At that time most program components were
still functioning smoothly. Much of the enthusiasm observed earlier
had disappeared, however, and the morale of both staff and students
was low.

By the end of August, when fourth-cohort interns were
posttested, attendance was down to about 30% and morale was at an
all-time low. While some members of the staff were optimistic that
funding would be found which would enable the program to continue,
others were actively looking for other employment. Although
positive feelings about the CIP continued to be expressed by both the
staff and the interns, the program bore hardly any resemblance to
what it had been before the original director resigned.

Site C. Site C had a difficult time establishing an acceptable
working agreement with the school district. The problem was largely
due to pressures brought to bear on the LEA by the local teachers'
union. It was not helped, however, by the fact that the CIP
leadership was inexperienced and underqualified. The person appointed
director had not, in fact, sought that job. He had applied for the
position of counseling supervisor but was named director when no
more-qualified person could be found on short notice. Other staff
positions were also filled with marginal people and, even after an
agreement with the school district had been worked out (through the
intervention of OIC/A), staff inadequacies at all levels plagued the
program.



Despite the fact that it had no viable working agreement with 
the LEA, Site C was the first site to enroll interns. This event 
occurred on February 23rd, 1978. The cohort comprised 38 interns, 
all of whom had previously dropped out of school. 

A working agreement with the LEA was finally signed on July
13th, 1978, but it was not until three months later that a second
cohort of interns was enrolled. Severe recruiting difficulties had
been encountered and only 46 interns (and no controls) had been
signed up. The program, nevertheless, was operating smoothly at the
time of RMC's October, 1978, site visit except for the divisiveness
and low staff morale that resulted from inadequate leadership.

October, November, and December were months of intensive
recruiting activity. Enrollment quotas of 90 interns and 55 controls
had to be met by January, 1979, or the program would, most probably,
have been shut down. Instructional and counseling activities were
reduced to a bare minimum as staff and interns alike engaged in a
wide variety of recruiting activities.

During this same time period, the local OIC realized that some 
action would have to be taken regarding CIP leadership. The OIC
executive director temporarily took over the CIP directorship. The 
original director was retained, however, in the hope that he would 
learn some of the skills he lacked during the interim period. 

The "catchment area" for recruiting was extended to include
three additional LEAs. As a result, the enrollment quotas were met.
Also as a result, however, the Site C CIP had to accommodate the
curriculum and graduation requirements of four LEAs rather than just
one. At the time of RMC's second visit to Site C (February, 1979),
the entire counseling staff was inundated with paperwork associated
with the rostering of the new interns into the courses they needed
to graduate. The counselors were frustrated that they had so little
time to spend counseling, and their morale suffered as a result.

Confusion regarding CIP leadership also had its impact on staff
morale and an atmosphere of paranoia prevailed as various
individuals jockeyed for position and maintained written logs of the
transgressions of others. The interns, too, sensed the program's
disarray. Derogatory graffiti began to appear on the lavatory walls
and clusters of students began to "hang out" in the hallways. Even
so, they continued to compare the CIP favorably with their former
high schools.






On March 2nd, the original director was reinstated on a
provisional basis. The situation worsened almost immediately, however,
and he was removed permanently at the end of the month.

In April, second-cohort interns were midtested. The program at 
that time was at its lowest point, but an interim director with 
appropriate credentials had been appointed and there was some reason 
for optimism. In early May RMC again visited Site C. While intern 
absenteeism continued to be high and most program components were 
being implemented perfunctorily, if at all, staff morale was
definitely on the rise.

A strong and well qualified person was appointed permanent 
director in mid-May. Shortly thereafter, third-cohort interns were 
midtested. 

Gradually the new director began rebuilding the program.
Ineffectual staff members were replaced and vacant slots were filled.
New procedures were developed and installed and "things began to
happen." At about this time, discussions were going on within the
Department of Labor regarding the possible extension of the
demonstration period. DOL was aware, however, of Site C's problems and
was seriously considering extending only the other three sites.
Site C knew it was "under the gun." Some staff members were
demoralized, believing that they would be "sacrificed" to provide an
object lesson to the other sites.

A representative of DOL visited Site C in June, 1979, osten- 
sibly to determine whether the site should be terminated. His visit 
was so perfunctory, however, that site personnel were left with the
impression that the decision had already been made. Again morale
was negatively affected, but efforts continued to pull the program
back together. 

The summer was a period of intense revision, reform, and
upgrading of operations in preparation for DOL's final review of the
program scheduled for October. In September, it became clear that
an additional cohort of interns could not be accommodated in the
current facility. Feeling that the chances of extension were good
(and might be enhanced by a more suitable building), a search was
conducted, and a suitable place was located. The CIP moved in
October, with staff and students completing the entire moving
operation themselves.

On October 30th, 1979, DOL made its long-awaited visit to the
site and found it sufficiently improved to be granted the same
nine-month extension planned for the other sites. Recruiting activities
began in earnest as a goal had been established of enrolling 100
interns and obtaining 75 controls. Posttesting of the second-cohort
interns was done at about this time.



In January, 1980, a new cohort of 66 interns was admitted to
the program. A control group of 29 members was also found.
Although these numbers fell far short of the established quotas, they
were accepted. The program, at this time, was almost fully imple- 
mented, intern attendance was good, and staff morale was high. 
While some problems remained, things had never been better at Site 
C. It was at this juncture that posttesting of the third-cohort 
interns was accomplished. 

RMC visited Site C in April, 1980. Program operations were 
observed to be running smoothly and intern attendance was high.
Staff morale, however, was not as good as during the previous visit. 
There was evidence of burnout. More importantly, however, the end 
of the demonstration period was drawing near. Staff were beginning 
to worry about finding new jobs and complaints about inequitable 
pay, lack of adequate vacation time, and related issues had begun to 
surface again. Another contributing factor was that monies promised 
for the extension period had been held up in Washington. Local
funds were used in the interim but they were limited and, for a
time, operations proceeded on a day-to-day basis with real concern
that the site would have to shut down for lack of funds. Despite
these problems, all major program functions were carried out in 
compliance with the program model. 

In late May-early June, fourth-cohort interns were midtested.
In mid-June the CIP director announced her plan to resign from the
program in mid-August. Subsequently, the instructional supervisor
and the reading specialist tendered their resignations. All three
left the program in mid-August while RMC visitors were on site.
Although the local OIC was confident of its ability to find strong
leaders to fill the vacant positions (and there is evidence that
they succeeded), morale among the remaining staff members was
observed to be very low. During the period between the submission of
resignations and actual departures, problems common to lame duck
administrations emerged. Staff members who were staying resented
those who were leaving and felt disinclined to follow their
instructions.

In late August-early September, fourth-cohort interns were 
posttested. 

Site D. Site D, like Site C, did not get off to a good start.
The original director not only lacked the skills and experience
required by the job, but was guilty of duplicity in dealing with her
staff. The local OIC executive director believed in "management by
exception" and provided little leadership or guidance.

In mid-April, 1978, the CIP staff moved into the remodeled
parochial school that was to house the program for the duration of
the demonstration. Both prior to and after that time, the CIP
director and the OIC executive director tried unsuccessfully to work
out an acceptable cooperative agreement with the school district.
Eventually OIC/A intervened. OIC/A met with the Site D school board
on May 5th, 1978, and an agreement was signed five days later.

The first cohort of 23 interns was enrolled at the end of May, 
1978. Recruiting for the second cohort began immediately but, by 
September, it was already clear that Site D would not be able to 
identify enough candidates to form both treatment and control 
groups. NIE waived the control group requirement and on October 
16th, 41 new interns were enrolled. 

RMC visited Site D in November, 1978 and found the program to
be in disarray. Most of the problems appeared to be the result of 
deficient leadership. The director had isolated herself from all of 
the staff except the instructional supervisor. Most communications 
to other staff members — even to the counseling supervisor — were by 
memorandum. Not surprisingly, this situation led to factionalism 
throughout the remaining staff. Some were deeply resentful and did
not hesitate to discuss their feelings with the RMC site visitors.
Others chose to side with management, while still others tried to
stay out of the conflict and simply do their jobs. 

As could be expected, staff morale was low, the program climate
was dominated by self-centered concerns, and implementation of
instructional and counseling functions was mechanical at best. The
interns were sensitive to all of these problems and were attending
sporadically. Attendance was observed to be below 50% and,
throughout the course of one afternoon, only 9 of the 47 enrolled students
were observed in the building.

OIC/A was aware of the worsening situation at Site D and, in 
December, 1978, prevailed upon the local OIC executive director to 
remove the CIP director and instructional supervisor. The OIC/A 
deputy director of the CIP demonstration then stepped in and took
control of the program. He remained at Site D for some three months
establishing new procedures, training staff, and generally reshaping
the program. He also found that relationships with the feeder
schools had been impaired by misinformation and negotiated new 
agreements. Finally, he was instrumental in finding a new director, 
who joined the CIP on March 12th, 1979. 

In January, 1979, while the program was being directed by the 
OIC/A deputy demonstration director, enough applicants had been 
recruited to form a third cohort of interns as well as a control
group. At the time of RMC's site visit a month later, staff morale
was very high, intern attendance had risen to approximately 70%, and
the program climate was positive, caring, and supportive. One of 
the instructors had been promoted to instructional supervisor and 
was proving to be both competent and well respected by her staff. 
While problems remained, the program had improved dramatically and 
appeared well on its way to full implementation. 

The second cohort was midtested in mid-March — just about the 
time the new director joined the program. She was a strong and
experienced leader who operated with an inclusive and democratic
management style. The progress made during the OIC/A intervention 
continued under her direction. When RMC visited the site again in 
May, the program was operating smoothly and morale was high among
both staff and interns. There had been a substantial number of
intern terminations between the March and May visits, but the 
attrition appeared to be largely the result of excessively zealous 
recruiting. In order to meet enrollment quotas, interns had been
taken into the program who were not adequately motivated and who
never seriously intended to remain. It was at about this time
(mid-May, 1979) that third-cohort interns were midtested. 

Over the summer of 1979, the CIP ran a reduced program to 
accommodate the interns' need for employment. Arrangements were
made with several summer youth programs that enabled interns to
attend classes in the mornings and work in the afternoons. In
September, the CIP resumed full operations when the public schools
reopened. In mid-October the second-cohort interns were posttested.

RMC visited Site D again in December, after the program had
been granted an extension through September of 1980. About 65
interns from the first three cohorts were still active and, although
the numbers were small, staff and student morale were high, the
program climate was very positive, and program functions were
operating very well. Recruiting for the fourth cohort was underway
and it was clear that there would be little difficulty in meeting
enrollment quotas. Relationships with the feeder schools had
become so positive under the new director's leadership that the CIP
was allowed to set up recruiting booths in the buildings and use
the public address system for announcements.

In January, 1980, a new cohort of 100 interns was enrolled. 
At approximately the same time, third-cohort interns were post- 
tested. RMC visited the site again in March and found the program 
still running smoothly. Attendance had stabilized at about 70% and 
morale continued to be high. Staff turnover (mostly for reasons of
advancement) was somewhat of a problem, but the program seemed able 
to attract well qualified replacements for those who left. 

Plans were underway for obtaining funds from alternative 
sources so that the program could continue beyond the nine-month
extension period. Proposals had been submitted to the CETA prime
sponsor, a private foundation, and the state. Everyone was
optimistic about the outcomes and there was little of the concern over
job security that was observed at the other sites. 

In May, 1980, fourth-cohort interns were midtested. 

RMC's final visit to Site D occurred in August, 1980, while
fourth-cohort interns were being posttested. The situation was much
as it had been in March. The program was operating smoothly and
both interns and staff were enthusiastic and working hard. One of
the long-term staff members commented, "it's smooth sailing now," as 
she recalled her first year and a half with the CIP. 






II. METHODOLOGY 



This study has employed a variety of data analysis techniques.
The majority of these techniques were applied to the analysis of
scores on paper-and-pencil tests administered to CIP interns and
members of the control and comparison groups prior to the intake of
each cohort at each site and approximately 6 and 12 months
thereafter. These analyses are discussed immediately below. Follow-up
data were also collected approximately 6 and 12 months after
posttesting. For the most part, these data consisted simply of
frequency counts of youths in various school and employment categories.
The methods used to analyze these data are discussed at the end of
this chapter.

Analysis of Test Data

As mentioned in the Introduction, this portion of the study
encompassed the simultaneous implementation of a control group
experimental design, a comparison group design, and a
norm-referenced design. Only the control group design was called for
in the request for proposal, but a decision was made by the time of
contract award to supplement it with a norm-referenced evaluation
since large and (possibly) differential attrition of students was
expected from the treatment (CIP) and control groups. Such
attrition, if it occurred, could create serious doubts regarding the
validity of inferences drawn from comparisons between treatment and
control groups.

The evaluation was further supplemented by the inclusion of 
various comparison groups approximately nine months after the study 
began. This step was taken because the sites were experiencing
serious difficulties in recruiting sufficient numbers of students to
fill treatment group quotas while also providing adequate numbers 
for the control groups. It was feared that control groups might 
have to be abandoned altogether or that they would be too small 
to provide a stable baseline against which to measure treatment 
effects. 

Constraints were imposed on the evaluation by a number of
circumstances associated with CIP operations at the four sites.
These constraints typically required that the standard procedures
associated with each design be modified. In some cases the
modifications were substantial and significantly affect the manner in
which the analyses should be interpreted. While the authors believe
that the inferences they have made and the conclusions they have
drawn are sound and credible, the reader is advised to note
carefully all the cautions and caveats contained in the following
descriptions of how each design was implemented.

* All of the various designs that were used attempt to measure
the impact of the CIP. Each, however, rests on different sets of
assumptions and asks a slightly different question. Appendix A
presents a comparison of the designs in these terms. It is included
for the methodologically inclined reader and need be of no concern
to others.

Instrumentation 

The study described herein used much the same instrumentation
as was used in the evaluation of the original CIP in Philadelphia
(Gibboney Associates, 1977). Both evaluations used standardized
reading and mathematics achievement tests. The original study used
subtests of the Stanford Achievement Test (1973 edition) while the
present study used the Metropolitan Achievement Test (1978 edition),
because the latter instrument was considered to be substantially
better suited for use with the CIP target population than the
former. Before the final selection was made, a careful examination
of thirteen of the most commonly used achievement tests was
undertaken. A summary of this evaluation is included as Appendix B of
this report.

Other instruments used in the original study were the Career
Development Inventory (Super, 1970), the Self-Esteem Inventory
(Coopersmith, 1967), the Internal-External Scale (Rotter, 1966), and
the Standard Progressive Matrices (Raven, 1940). These same
instruments were used in the present study (except for the Standard
Progressive Matrices, copies are included in Appendix G). There was
one difference, however; the Standard Progressive Matrices test was
used both pre and post in the original study whereas it was used
only as a pretest in the present study. This change was made in
response to a suggestion made by the NIE Project Officer.

With the exception of several pretest sessions at one site, all
testing was accomplished by RMC-employed site assistants with
appropriate professional qualifications. The few test sessions not
conducted by RMC were run by a senior-level graduate student in
psychometrics who was employed as a CIP math teacher at the time.
He was trained by the regular RMC tester at that site and was judged
to be well qualified.

The Control Group Design

The evaluation of the original CIP in Philadelphia made use of
a randomly assigned control group in order to generate a baseline
against which the growth of CIP participants could be measured.
More candidates were recruited for the program than could be served,
and a lottery-like procedure was then used to determine which
applicants would be assigned to the control group and which would be
admitted to the program. At mid- and posttesting times, members of
the control group were paid to complete the instruments.

The evaluators noted several problems with this approach
(Gibboney Associates, 1977). First, many of the control group
students who returned for mid- and posttesting lacked motivation
and were observed to mark their answer sheets at random. While
an attempt was made to compensate for this problem through
application of a statistical adjustment, the results were unsatisfactory.
(See Appendix C for a discussion of valid and invalid uses of the
correction for guessing.)

A second problem was that attrition from both treatment and
control groups was very high (approximately 50% at the time of
midtesting and 70% by posttest time; see Table 2, p. 28, for a
breakdown by site and by cohort). It seemed likely that attrition
from the two groups was non-random and that biases might have
resulted which would compromise inferences drawn from subsequent
treatment-control comparisons. The more able or more highly
motivated control group students, for example, might have changed
schools or taken jobs, thus making them unavailable for mid- or
posttest data-collection sessions. On the other hand, the treatment
group students who were not present for these sessions could easily
have been those at the other extreme of the distribution who would
or could not do the work required to remain in the program. Other
hypotheses may be equally plausible, but the fact remains that while
random assignment may have assured parity between the original
groups, that parity could well have been destroyed by differential
attrition.

RMC attempted to deal with each of these problems through
design modifications. To control for random responding, students
were paid for correct responses on those instruments where responses
could be judged either correct or incorrect. The details of this
incentive payment strategy are discussed below.

To combat the differential attrition problem, a decision was
made to adopt a matching strategy that entailed the formation of
dyads or triads of students (depending on treatment and control
group quotas) who were as much alike as possible in terms of
identifiable, educationally relevant characteristics. One member of
each dyad or triad was then selected for the control group while the
remaining members were invited to enroll in the CIP. The plan was
to limit comparisons between treatment and control groups to those
dyads or triads where it was possible to obtain mid- and/or posttest
data on the control group member and at least one treatment group
member. While this procedure would reduce the size of the
evaluation sample, it would also presumably eliminate the bias that
might otherwise have resulted from differential attrition. The
details of the matching procedure are also described below.

Incentive payment strategy. Because it seemed likely that
students with no stake in the study (control and comparison group
students) would not put forth their best efforts when responding to
mid- and posttest questions, a decision was made to provide an
incentive, in the form of a cash payment, for correct responses.
Thus, in addition to being paid $10.00 for coming to data-collection
sessions, students were paid $.07 for each item they answered
correctly. To avoid the problems that might have arisen from
differential reinforcement, members of the treatment group received
the same cash incentives.



There were 190 correct answers on the reading and math
achievement tests, the Standard Progressive Matrices, and the Information
scale of the Career Development Inventory (items making up the
other instruments or scales had no correct answers). Students could
thus earn as much as $13.30 for correct responding plus the $10.00
for attending. In fact, typical payments were in the $18.00-$20.00 
range. Tests were scored immediately, and students were paid in 
cash or by check within minutes of completing the last instrument. 
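
The payment arithmetic described above can be sketched as follows.
This is an illustration only; the function name and the use of
integer cents are assumptions and are not part of the study's
procedures.

```python
ATTENDANCE_CENTS = 1000   # $10.00 for attending a data-collection session
PER_ITEM_CENTS = 7        # $.07 for each item answered correctly
SCORABLE_ITEMS = 190      # items that have a correct answer


def incentive_payment_cents(items_correct: int) -> int:
    """Return the total payment, in cents, for one testing session."""
    if not 0 <= items_correct <= SCORABLE_ITEMS:
        raise ValueError("items_correct must be between 0 and 190")
    return ATTENDANCE_CENTS + PER_ITEM_CENTS * items_correct


# Maximum possible payment: $13.30 for items plus $10.00 for attending.
max_payment = incentive_payment_cents(SCORABLE_ITEMS)
print(f"${max_payment / 100:.2f}")
```

Under these figures, the typical payments of $18.00 to $20.00 would
correspond to roughly 115 to 143 items answered correctly.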

While the incentives were considered generous, it became clear
that they were not entirely successful in achieving the desired
results. In one instance, students who had been scheduled for
testing were observed playing basketball on a court outside the
school. They could not be lured in for data collection. In at
least two other instances, students were observed marking their
answer sheets without referring to the test booklet. Despite these
occurrences, it seemed clear that the incentive strategy was at
least moderately successful. The majority of test scores appeared
to be valid and the anomalies observed in the Philadelphia
evaluation data (e.g., mean posttest scores being lower than mean pretest
scores) were eliminated.

Incentive payments were made to members of the comparison
groups at pre-, mid-, and posttesting times. There were no such
payments to treatment or control group members at pretest times
since they were motivated to do well in order to qualify for
admission to the CIP. Both treatment and control group members were
paid, however, at mid- and posttest times. While treatment group
members would probably have been adequately motivated without
incentive payments, there was evidence that they would have resented
not being treated in the same manner as the other groups.

Matching treatment and control students. The variables on
which students were matched were primarily pretest scores and age.
Separate matchings were undertaken for reading and math. Where a
surplus of good matches could be achieved on the two primary
variables, grade level and number of academic credits needed to
graduate from high school were also considered. This set of criteria
was incomplete and would have been expanded to include at least
pre-CIP school attendance rates had it been possible to obtain this
information. Nevertheless, a large proportion of total among-student
variance was brought under experimental control by the
matching process.
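
A minimal sketch of dyad formation follows. It is not the procedure
RMC actually used: the sort-and-pair rule, the function names, and the
sample records are all hypothetical, and it handles one subject
(reading or math) and dyads only. It shows the two ideas in the text:
pair similar students on pretest score and age, then choose the
control member of each pair at random.

```python
import random


def form_dyads(students, seed=0):
    """Pair similar students and pick one of each pair as the control.

    `students` is a list of (student_id, pretest_score, age) tuples.
    """
    rng = random.Random(seed)
    # Sort on the two primary matching variables so neighbours are similar.
    ordered = sorted(students, key=lambda s: (s[1], s[2]))
    dyads = []
    for i in range(0, len(ordered) - 1, 2):
        pair = (ordered[i], ordered[i + 1])
        k = rng.randrange(2)  # random selection of the control member
        dyads.append({"control": pair[k][0], "treatment": pair[1 - k][0]})
    return dyads  # an odd student out is left unmatched


def intact_dyads(dyads, tested_ids):
    """Keep only dyads in which both members have later test data."""
    return [d for d in dyads
            if d["control"] in tested_ids and d["treatment"] in tested_ids]


# Hypothetical pretest records: (id, pretest score, age).
roster = [("a", 30, 16.0), ("b", 31, 16.2), ("c", 45, 17.0),
          ("d", 44, 17.1), ("e", 60, 18.0)]
dyads = form_dyads(roster)
```

Restricting later comparisons to `intact_dyads` mirrors the plan of
analyzing only those matched groups for which follow-up data exist on
both sides.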

It was not expected that perfect matches could be achieved even
under ideal circumstances. As it happened, however, circumstances
were far from ideal. Severe problems were encountered in recruiting
adequate numbers of students to meet treatment group quotas. For
this reason there was no control group for the second cohort and the
plan to serve four cohorts during the original demonstration period
had to be abandoned (although a fourth cohort was served during the
extension period).

Recruitment for the third and fourth cohorts extended over a
very long time period. Many pretesting sessions had to be scheduled
with small numbers of candidates tested at each session. Program
staff at the CIP sites felt that potential interns were being lost
due to lengthy delays between being tested and being informed as to
whether or not they would be admitted to the program. As a result,
they requested that treatment and control group assignments be made
at the end of each week in which testing occurred and that
candidates be notified of their status.

The need to assign students to treatment and control groups on
a weekly basis interfered substantially with the matching process.
Typically, data were available on only a few students, and the
formation of well matched dyads or triads was often impossible.
Despite this difficulty, the matching procedure was continued (as
well as it could be) and selection of students for the control group
continued to be random from each dyad or triad. It was felt that,
while treatment and control group assignments could not be changed,
it would be legitimate to improve the matching of treatment with
control group members after all the students had been pretested
(Cook & Campbell, 1979, pp. 47, 48). Such post hoc matching, of
course, would have to be done without any knowledge about the status
of students after the pretest since such knowledge (e.g., that a
student selected for the treatment group had chosen not to enroll)
could clearly bias the matching process and, thereby, the results
of any subsequent analyses.

The matching (or rematching) process was further complicated by
the fact that pretesting spanned a time interval of more than four
months. Because reading and math skills develop over time, it
seemed unlikely that a student would obtain the same test score if
tested in late January that he or she had actually obtained when
tested in the middle of the preceding September. It follows that
two students who obtained identical scores when tested at widely
different times would not have obtained identical test scores had they
been tested at the same time.

Adjusting test scores for different testing times. Because of
the problem just discussed, it was considered necessary to attempt
some form of statistical adjustment to obtain estimates of the
scores students would have achieved had they all been tested at the
same time. This adjustment was accomplished for reading and math
achievement-test scores through use of normative data. The
procedure was as follows.

The assumption was made that students whose scores placed them
at a particular percentile rank in the national distribution at time
T1 would tend to score at the same percentile rank at time T2.
(This same equipercentile assumption also underlies the
norm-referenced evaluation design described later in this chapter.)
Given the equipercentile assumption, a test score, and a test date,
it follows that interpolating between adjacent empirical normative
data points can yield estimates of the score that would have been
obtained on any other particular test date. Unfortunately, the
process is not quite as clear-cut as it appears on the surface.

The most salient complication to the interpolation process
stemmed from the fact that percentiles do not constitute an
equal-interval scale. Thus, if a test score obtained half-way between
adjacent empirical normative data points was found to correspond to
the 25th percentile in the earlier norms and the 5th percentile in
the later norms, it would be incorrect to infer that the
interpolated value would be the 15th percentile. (The 12th percentile
actually lies midway between the 25th and the 5th.) This particular
difficulty was overcome by converting percentiles to normal curve
equivalents (NCEs) before interpolating.
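
The percentile example above can be checked with a short sketch. It
assumes the usual definition of the NCE as a normalized standard score
with a mean of 50 and a standard deviation of 21.06; the function
names are illustrative and do not come from the study.

```python
from statistics import NormalDist

NCE_MEAN = 50.0
NCE_SD = 21.06
_STD_NORMAL = NormalDist()  # standard normal distribution


def percentile_to_nce(percentile: float) -> float:
    """Convert a percentile rank (0-100, exclusive) to an NCE."""
    z = _STD_NORMAL.inv_cdf(percentile / 100.0)
    return NCE_MEAN + NCE_SD * z


def nce_to_percentile(nce: float) -> float:
    """Convert an NCE back to a percentile rank."""
    z = (nce - NCE_MEAN) / NCE_SD
    return 100.0 * _STD_NORMAL.cdf(z)


# Interpolate half-way between the 25th and 5th percentiles on the NCE scale.
earlier, later = percentile_to_nce(25), percentile_to_nce(5)
midway_nce = (earlier + later) / 2
midway_percentile = nce_to_percentile(midway_nce)
print(round(midway_percentile, 1))  # near the 12th percentile, not the 15th
```

Averaging on the NCE scale and converting back gives a value close to
the 12th percentile, which agrees with the figure quoted in the text.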

The second complication related to the fact that cognitive
growth rates are not linear over the twelve months of each calendar
year. This complication could not be resolved as satisfactorily as
the first because little is known about the exact shape of the
growth function. What is known, however, is that growth is slower
over the summer than during the school year, particularly for
low-achieving students (Tallmadge, 1978; National Institute of
Education, 1978; Thomas & Pelavin, 1976; Tallmadge & Horst, 197^).
This difference in growth rates can easily be seen in most test
publishers' norms tables by comparing the gain in standard-score
points per month between fall and the following spring with the gain
between spring and the following fall. Unfortunately, it seems
likely that further non-linearities exist since the spring-to-fall
interval usually ranges from sometime in April to sometime in
October and thus encompasses several months of the school year as
well as the summer vacation.

If one assumes that cognitive growth proceeds at one
more-or-less constant rate while school is in session, and at a slower, but
also constant rate over the summer, then it would be appropriate to
use the October-to-April growth rate from September to June and,
subsequently, to determine a June-to-September growth rate using
whatever annual gain remains. Although alternative rationales could
have been developed (e.g., it could have been assumed that start-up
would be slow and that the school year would end with a tailing off
of growth), the approach described was the one adopted.



* Normal curve equivalents are normalized standard scores with
a mean of 50 and a standard deviation of 21.06 (when a nationally
representative sample of any age/grade group is tested). They match
percentiles at values of 1, 50, and 99 but, under the assumption
that the attribute measured is normally distributed in the
population, they constitute an equal-interval scale.






September 15th and June 15th raw-score-to-NCE norms tables were
generated by extrapolating from the October 15th and April 20th
Metropolitan Achievement Test normative data points. These
extrapolated norms tables were subsequently used to obtain interpolated
NCEs for each student as a function of his/her own particular
testing date.
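
The two-rate growth assumption can be sketched as below for a single
percentile rank. The function name, the input values, and the
simplification that the September-to-September gain equals the
October-to-October gain are all assumptions made for illustration;
they are not taken from the publisher's norms or from RMC's tables.

```python
from datetime import date


def extrapolate_norm_points(fall, spring, next_fall, year):
    """Extend Oct 15 / Apr 20 norm points to Sept 15 and June 15.

    fall, spring, next_fall: scores at one fixed percentile rank on
    Oct 15 of `year`, Apr 20 of `year + 1`, and Oct 15 of `year + 1`.
    """
    oct15, apr20 = date(year, 10, 15), date(year + 1, 4, 20)
    sep15, jun15 = date(year, 9, 15), date(year + 1, 6, 15)
    next_sep15 = date(year + 1, 9, 15)

    # One constant daily rate while school is in session.
    school_rate = (spring - fall) / (apr20 - oct15).days
    score_sep15 = fall - school_rate * (oct15 - sep15).days
    score_jun15 = spring + school_rate * (jun15 - apr20).days

    # Whatever annual gain remains is spread evenly over the summer.
    # Assumption: the Sept-to-Sept gain equals the Oct-to-Oct gain.
    annual_gain = next_fall - fall
    summer_gain = annual_gain - (score_jun15 - score_sep15)
    summer_rate = summer_gain / (next_sep15 - jun15).days

    return {"sep15": score_sep15, "jun15": score_jun15,
            "school_rate": school_rate, "summer_rate": summer_rate}


# Hypothetical scores at one percentile rank.
points = extrapolate_norm_points(fall=40.0, spring=46.0,
                                 next_fall=50.0, year=1979)
```

With these hypothetical inputs the summer rate comes out lower than
the school-year rate, which is the pattern the text describes.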

Caution. A word of caution should be inserted at this point.
The procedure just described must be regarded as a poor substitute
for testing all students on the same date. (For norm-referenced
evaluations, the testing date should also correspond to one of the
test's empirical normative data points.) While the authors believe
the approach taken was sound, and that there was no better way to
deal with the need for staggered testing, small errors have almost
certainly been introduced. It seems unlikely that the magnitude of
such errors would be sufficient to obscure any educationally
significant treatment effect, but even that possibility must be
acknowledged.

Selecting appropriate norm groups. An additional problem needs
to be mentioned. Most CIP interns ranged from 16 to 21 years of age
but a few exceptions to this age-range requirement were made for
various reasons. Many program participants were dropouts who had
been out of school for varying amounts of time. Most of those who
had not dropped out were classified as juniors or seniors in their
respective high schools even though they lacked too many credits to
graduate with their classes. Others had been held back one or more
years. For these various reasons, it was often not clear what norms
tables were most appropriate for individual students.

Ultimately, a decision was made to categorize students
according to their ages rather than their grade levels. The age of each
student as of October 2nd of the academic year they entered the
program was determined. Youths whose ages were between 14 and 14.95
were treated as 9th graders. Those between 15 and 15.95 were
treated as 10th graders. Those above 16 were treated as 11th
graders. Regardless of their ages, no students were treated as
12th graders at pretest time since 12th-grade norms (the highest
level of norms tables) had to be reserved for use with the posttest
scores of interns classified as 11th graders when they entered the
program.
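
The age rule can be written as a small function. This is a sketch
with assumed cut-offs: the text gives bands of 14 to 14.95, 15 to
15.95, and above 16, so the boundaries at exactly 15 and 16, and the
handling of anyone younger than 14, are simplifications made here.

```python
from datetime import date


def age_on(birth_date: date, reference: date) -> float:
    """Approximate age in decimal years on the reference date."""
    return (reference - birth_date).days / 365.25


def norm_grade_at_pretest(age: float) -> int:
    """Grade whose norms tables are applied to a pretest score."""
    if age < 15:
        return 9
    if age < 16:
        return 10
    # Never 12 at pretest: 12th-grade norms are reserved for posttests.
    return 11


# Hypothetical student entering in the 1979-80 academic year.
entry_age = age_on(date(1963, 5, 1), date(1979, 10, 2))
pretest_grade = norm_grade_at_pretest(entry_age)
```

Capping the pretest classification at grade 11 leaves the 12th-grade
norms tables free for posttest scores, as the text explains.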

Out-of-level testing. A final but minor problem related to the
test-norming issue is that all treatment, control, and comparison
group students were tested out of level. That is, although the
majority of the students could be considered as 10th, 11th, or 12th
graders, they were tested with the level of the Metropolitan
Achievement Test (Advanced Level 1) intended for 7th-through-9th
graders. This testing approach was adopted deliberately in view of
the fact that most of the students tested were known to be low
achievers. Many would find the in-level test too difficult and
their scores, as a result, would be unreliable.



Although the test itself was designed for students in grades
7, 8, and 9, it was possible to gain access to 10th-, 11th-, and
12th-grade norms by means of the (vertical) scale scores. With the
Metropolitan Achievement Test (1978 edition) the process is as
follows: (a) the out-of-level raw score is converted to a scale
score, (b) the scale score is converted to an in-level percentile
rank, and (c) the in-level percentile rank is converted to an NCE.
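
The three conversion steps can be sketched as a chain of lookups. The
two tables below hold made-up values purely for illustration; the real
conversions come from the publisher's norms tables, which are not
reproduced in this report. Step (c) assumes the standard NCE
definition (mean 50, standard deviation 21.06).

```python
from statistics import NormalDist

# Hypothetical tables; real values come from the test publisher.
RAW_TO_SCALE = {20: 610, 25: 640, 30: 670}                 # step (a)
SCALE_TO_PERCENTILE_GRADE_11 = {610: 12, 640: 24, 670: 38}  # step (b)


def percentile_to_nce(percentile: float) -> float:
    """Normalized standard score with mean 50 and SD 21.06."""
    return 50.0 + 21.06 * NormalDist().inv_cdf(percentile / 100.0)


def out_of_level_nce(raw_score: int) -> float:
    """Convert an out-of-level raw score to an in-level NCE."""
    scale = RAW_TO_SCALE[raw_score]                     # (a) raw -> scale score
    percentile = SCALE_TO_PERCENTILE_GRADE_11[scale]    # (b) scale -> percentile
    return percentile_to_nce(percentile)                # (c) percentile -> NCE


example_nce = out_of_level_nce(25)
```

The scale score is the link between levels: it lets a raw score earned
on the lower-level test be read against the norms for the student's
assigned grade.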

Unadjusted measures. The techniques used to adjust
achievement-test scores for differences in testing dates could not be
applied to the Career Development Inventory, the Internal-External
Scale, or the Self-Esteem Inventory because no normative data were
available. Since none of these measures was used for matching,
however, and since the ratio of treatment to control group students
was approximately the same for each testing date, no biases in
treatment-control analyses should have resulted from this failure to
adjust.

Systematic influences may be present in the
treatment-vs.-comparison-group analyses since most comparison-group students were
tested later in the year than treatment and control students. On
the other hand, the nature of the measures, coupled with the fact
that the treatment did not begin until after all students
(treatment, control, and comparison) had been pretested, suggests to the
authors that the differences in testing times would not
significantly affect the evaluation findings. Again, however, readers are
cautioned that this inference may be questionable.

Analyzing the data. It was originally intended that all
treatment-control comparisons would be based on intact, matched
dyads or triads of students. This strategy was employed to
counteract the potentially biasing influences of differential attrition.
Unfortunately, the rate of attrition was very high and the number of
intact groups available for analysis was correspondingly low at all
sites. Matched-groups analyses were undertaken, but they were
supplemented with covariance and standardized-gain analyses in order
to capitalize on the larger sample sizes that were available for
these analyses.

The matched-groups analyses were all performed using t tests
for paired observations. This type of analysis is exactly comparable
to a single classification analysis of variance. These analyses
were done separately for each site and for each criterion
variable.

The covariance and standardized-gain analyses employed in this
study were conducted using three somewhat different approaches.
Traditional covariance analysis (see Winer, 1971) employs a common,
within-group (treatment and control) post-on-pretest regression
line. Similarly, the traditional standardized-gain analysis makes
use of a common, within-group principal axis of the treatment and
control groups' bivariate scatter plots (see Kenny, 1975, and
Tallmadge, 1978). In both cases, the underlying assumption is that
these within-group statistics provide better estimates of the
population values than either of the individual lines. Because this
assumption may often be questionable, RMC elected to conduct three
versions of each analysis, one using the control group's regression
line/principal axis, one using the treatment group's, and one using
the common within-group regression line/principal axis.
[...]tions of these analyses are given in the Results section of this
report.
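The three covariance variants differ only in which post-on-pretest slope is used to adjust the posttest means. The following is a minimal sketch of that idea, not RMC's actual program; the function names and the sample scores in the usage note are hypothetical.

```python
from statistics import mean


def _moments(pre, post):
    """Return pretest mean, posttest mean, Sxx, and Sxy for one group."""
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    return mx, my, sxx, sxy


def adjusted_gains(pre_t, post_t, pre_c, post_c):
    """Adjusted treatment-minus-control posttest difference under three slopes."""
    mxt, myt, sxx_t, sxy_t = _moments(pre_t, post_t)
    mxc, myc, sxx_c, sxy_c = _moments(pre_c, post_c)
    slopes = {
        "control": sxy_c / sxx_c,                     # control group's own line
        "treatment": sxy_t / sxx_t,                   # treatment group's own line
        "common": (sxy_t + sxy_c) / (sxx_t + sxx_c),  # pooled within-group line
    }
    # Adjust the raw posttest difference for the groups' pretest difference.
    return {
        name: (myt - myc) - slope * (mxt - mxc)
        for name, slope in slopes.items()
    }
```

For example, `adjusted_gains([30, 40, 50], [35, 45, 55], [20, 30, 40], [22, 32, 42])` yields an adjusted gain of 3 under all three slopes, because both hypothetical groups happen to share a slope of 1. With real data the three estimates will generally differ, which is the reason for reporting all three.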

The traditional covariance analyses (that used the common
within-group regression line) employed the standard F test (Winer,
1971, p. 772). Exact F tests were not worked out for the covariance
analyses that employed just the comparison group's regression line
or just the treatment group's regression line. Approximate F ratios
were calculated using the denominator from the standard covariance
analysis and the gain estimate squared as the numerator (gain =
treatment group's adjusted mean posttest score minus comparison
group's adjusted mean posttest score). Although the values calculated
in this manner may differ slightly from the exact, least
squares Fs, the differences should be small in all cases and should
not affect any interpretations of the results.

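To the authors' knowledge, no exact F test has yet been worked
out for standardized-gain analysis. The approximation used here
was

$$F \approx \frac{(\text{Difference in adjusted posttest scores})^{2}}{D}$$

where the denominator $D$ is formed from the two groups' sums of
squares $SS_i$ (its exact form is illegible in the source),

$$SS_i = \sum Y_i^{2} - \frac{\left(\sum Y_i\right)^{2}}{n_i} + b^{2}\left[\sum X_i^{2} - \frac{\left(\sum X_i\right)^{2}}{n_i}\right] - 2b\left[\sum X_i Y_i - \frac{\left(\sum X_i\right)\left(\sum Y_i\right)}{n_i}\right]$$

and $b$ is a slope whose defining expression is likewise illegible
in the source.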




As was the case with the covariance analyses, three different
versions of the standardized-gain analysis were computed, one using
the slope of the common, within-group principal axis, one using the
slope of the comparison group's principal axis, and one using the
slope of the treatment group's principal axis.
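The per-group sum of squares that enters the approximate F ratio is the sum of squared deviations of posttest scores about a line of slope b passing through the group's means. The sketch below computes it from raw sums and checks it against a direct calculation. The slope is simply passed in, since the report does not give a legible formula for it, and the function names are illustrative.

```python
def residual_ss(pre, post, b):
    """Sum of squares about a line of slope b, computed from raw sums."""
    n = len(pre)
    sum_x, sum_y = sum(pre), sum(post)
    syy = sum(y * y for y in post) - sum_y ** 2 / n
    sxx = sum(x * x for x in pre) - sum_x ** 2 / n
    sxy = sum(x * y for x, y in zip(pre, post)) - sum_x * sum_y / n
    return syy + b * b * sxx - 2 * b * sxy


def residual_ss_direct(pre, post, b):
    """The same quantity, computed deviation by deviation as a cross-check."""
    n = len(pre)
    mx, my = sum(pre) / n, sum(post) / n
    return sum(((y - my) - b * (x - mx)) ** 2 for x, y in zip(pre, post))
```

The two functions agree up to rounding error for any slope, which confirms that the raw-sum expression is only an algebraic expansion of the squared deviations.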

The Comparison Group Design

Approximately nine months after this study began, recruiting 
difficulties experienced at all four sites made it clear that 
control groups available for the study would be of minimally 
acceptable size. For this reason DOL/NIE decided to supplement 
the evaluation through the employment of various comparison groups.

A brief feasibility study led to the conclusion that, in three
of the sites, it would be possible to form comparison groups of (a)
potential dropouts in feeder high schools who had not applied for
admission to the CIP, and (b) participants in other alternative-school
programs. In one of these three sites it appeared that a
group of actual dropouts not participating in any academic program
could also be assembled. The future of the fourth site (Site D) was
uncertain at that time; therefore no attempts were made to form
comparison groups.

Most members of the various comparison groups were pretested in
January, 1979. A few were tested in late December, 1978, and a few
in early February, 1979. They were mid-tested in May and June,
1979, and were posttested in February and March, 1980. Raw scores
on the reading and math achievement tests were converted to interpolated
NCEs using the same procedures employed with the treatment
and control groups. No adjustments were made to scores on the other
instruments to compensate for differences in treatment and comparison
group testing dates. Unfortunately, these differences are more
likely to impact on the comparison group analyses than on the control
group analyses. While students were assigned to treatment and
control groups shortly after each pretesting session, thereby effecting
a proportional balance, this was not the case with the comparison
group. All comparison group students were pretested near
the end of the four-month interval during which treatment group
students were pretested.

All comparison group analyses were done using covariance and
standardized-gain procedures. Pretest scores were used as the
single covariate.

The Norm-Referenced Design

Norm-referenced evaluations of various types have been popular
for many years. Recently one such design was developed for nationwide
use in evaluating projects funded under Title I of the Elementary
and Secondary Education Act (Tallmadge & Wood, 1976). Evidence
from a study which compared gain estimates derived from that
norm-referenced design with ones derived from simultaneously implemented,
random-assignment experiments suggests that the two types of
estimates are about equally accurate, at least under the circumstances
that were studied (Tallmadge, 1981).

The model is based on what has come to be known as the equipercentile
assumption that was referred to earlier. This assumption
holds that, in the absence of any special educational intervention,
students will retain their percentile (or NCE) status relative
to a norm group over time. Pretest status thus becomes predicted
posttest status, and gains are measured by subtracting predicted
posttest status from actual posttest status (Posttest NCE - Pretest
NCE).

There are two steps in the procedure recommended for implementing
the norm-referenced model that were not feasible in the CIP
Evaluation. First, all testing (pre-, mid-, and post-) should
be accomplished within about two weeks of the test's empirical
norming date(s). Unfortunately, not only did the cohort intake
dates preclude such timing, but recruiting difficulties necessitated
extending the pretesting period over four months (in the case
of the third cohort). In an attempt to deal as effectively as
possible with this problem (as mentioned earlier), the Metropolitan
Achievement Test's October 15th and April 20th norms were first
extrapolated to September 15th and June 15th. Each student's raw
score was then converted to an NCE by interpolating between the
extrapolated norms tables according to his or her individual testing
date. Some error was certainly introduced by this procedure, but
its magnitude is thought to be small and cannot be accurately
predicted.
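The date-based interpolation can be sketched as follows. The report does not state the interpolation rule, so linear interpolation in calendar days is assumed here. The function name, the norm-year dates, and the NCE values in the usage note are hypothetical.

```python
from datetime import date


def interpolate_nce(test_date: date,
                    fall_date: date, fall_nce: float,
                    spring_date: date, spring_nce: float) -> float:
    """Linearly interpolate a raw score's NCE between two norms tables.

    fall_nce and spring_nce are the NCEs that the same raw score earns
    under the fall-dated and spring-dated norms tables respectively.
    """
    span_days = (spring_date - fall_date).days
    elapsed = (test_date - fall_date).days
    fraction = elapsed / span_days
    return fall_nce + fraction * (spring_nce - fall_nce)
```

For instance, if a given raw score is worth an NCE of 40 against norms dated September 15 and 34 against norms dated June 15, a student tested partway through the year receives a value between 34 and 40 in proportion to the elapsed time.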

The second model-implementation problem concerned the rule that
a single set of test scores cannot be used both to select students
for participation in a program and as their pretest measure. When
this rule is violated, a spurious regression to the mean occurs, and
gains are artifactually either inflated or reduced. In the CIP,
students were required to read at the fifth-grade level (more
accurately, the entry criterion was set at one standard error of
measurement below the fifth-grade reading level). Some candidates
scored below this level and were denied admission to the program.
To the extent that this happened, students were indeed "selected on
the pretest," since they were not re-pretested after being accepted
into the CIP.

In the authors' opinion, the biasing influence of pretest
selection was small because, except in one site, the great majority
of students scored well above the cutoff. To the extent that a bias
does exist, however, it will cause gain estimates to be too low.
The norm-referenced evaluations will thus tend to be conservative.
Real gains may be slightly higher than the norm-referenced estimate.


All norm-referenced evaluations were conducted using the 
standard paired-observations t test. 
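A norm-referenced evaluation of this kind reduces to a paired-observations t test on each student's NCE gain (posttest NCE minus pretest NCE), with the expected gain under the equipercentile assumption being zero. The sketch below is illustrative only. The function name is hypothetical and the scores in the usage note are made up.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre_nce, post_nce):
    """Return (mean gain, t statistic, degrees of freedom) for paired scores."""
    gains = [post - pre for pre, post in zip(pre_nce, post_nce)]
    n = len(gains)
    mean_gain = mean(gains)
    standard_error = stdev(gains) / sqrt(n)  # sample SD of the gains over root n
    return mean_gain, mean_gain / standard_error, n - 1
```

With hypothetical pretest NCEs of 30, 35, 40, and 45 and posttest NCEs of 33, 37, 44, and 46, the mean gain is 2.5 NCEs and t is about 3.87 on 3 degrees of freedom.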






The test-score analyses described above involved all control
group students who could be attracted to the data collection sessions
by the monetary and other incentives that were offered. As
far as the treatment group was concerned, only those interns who
were active participants at the time of testing or who had graduated
were included. No attempt was made to test youths who had been
invited to join the program but who had failed to enroll or who had
terminated prior to the testing session. For the two follow-up
studies that were undertaken (the first in the summer of 1980 and
the second in January/February, 1981), a slightly different approach
was taken.

Attempts were made to contact all youths assigned to the
treatment groups and all assigned to the control groups. If direct
contact could not be established, information about these youths was
sought from school personnel and records, and from relatives,
friends, and neighbors. Had we succeeded in obtaining information
on all of the youths, the "true experiment" with which the study
began would have been preserved. Whatever "treatment effects" might
have emerged from the analyses would have been unaffected by possible
self-selection biases and highly credible. Despite intensive
efforts that included door-to-door canvassing of neighborhoods,
however, the return rate was slightly below 80%. (See Table 53, p.
88, for a breakdown by site and by cohort.) While this return rate
was surprisingly high considering the much smaller number of youths
from whom it was possible to obtain test scores, it was not high
enough to remove all possibility of bias resulting from differential
attrition. Still, it should have reduced it.

While including untreated members of the treatment group in the
analyses serves to maintain the integrity of the design, it also
minimizes the size of treatment effect estimates, since gains made
by treated students are at least partially offset by the zero expected
gains of the untreated students. The latter consideration
led RMC to subdivide the treatment group into treated (those who
enrolled in the CIP and remained a minimum of three months) and
untreated (those who did not enroll or left the program in less than
three months) subgroups. In weighing evidence from the two follow-ups,
it should be kept in mind that comparisons between treatment
and control groups will systematically underestimate the size of
treatment effects while those between the treated subgroup and
either the untreated subgroup or the control group will systematically
overestimate treatment effects (because of self-selection
bias).
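The direction of the first bias can be shown with a small calculation. If untreated members of the treatment group are assumed to gain nothing from the program, the whole-group estimate is the treated subgroup's effect scaled by the share of the group that was actually treated. The function name and the figures in the usage note are hypothetical, and the sketch ignores self-selection entirely.

```python
def diluted_effect(effect_among_treated: float, share_treated: float) -> float:
    """Expected whole-group effect when untreated members gain nothing."""
    if not 0.0 <= share_treated <= 1.0:
        raise ValueError("share_treated must be between 0 and 1")
    # The untreated share contributes zero, so the effect shrinks in proportion.
    return effect_among_treated * share_treated
```

For example, a true effect of 8 NCEs among treated students would appear as 4.8 NCEs for the assigned treatment group as a whole if only 60 percent of its members were treated.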

The follow-up data lent themselves to two major comparisons.
The first compared groups in terms of high school status. The
proportions from each group who had graduated from high school, were
currently enrolled, or had earned GEDs were contrasted with the
proportion who had dropped out of school prior to graduation and
had not earned GEDs. The second major comparison contrasted groups
in terms of those members who were either enrolled in school (high
school, college, GED, or vocational) or employed, as opposed to
those who were neither enrolled nor employed.

Data from the first and second follow-ups were analyzed separately.
The analyses were conducted separately by site and by
cohort as well as across cohorts and across sites, just as was done
with the test score analyses.





III. RESULTS 



This chapter summarizes the findings of the entire outcome
evaluation task. It is organized under three major headings: (a)
holding power, (b) test-score outcomes, and (c) follow-up findings.

Holding Power 

Table 1 presents the numbers of treatment and control group
youths who were (a) pretested, (b) midtested, and (c) posttested
by site and by cohort. Table 2 presents the same data but reduced
to attrition rates from pre-to-midtest and from pre-to-posttest.
These data are intended to provide some indication of the CIP's
ability to retain youths after they enrolled. Unfortunately, for
reasons explained below they are somewhat misleading.
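The reduction from sample sizes to attrition rates is a one-line calculation. The sketch below uses an illustrative function name, and the usage note takes its counts from the Site B, third-cohort treatment row of Table 1.

```python
def attrition_rate(n_pretested: int, n_retested: int) -> float:
    """Percentage of pretested youths who were not tested at a later occasion."""
    return 100.0 * (1 - n_retested / n_pretested)
```

With 121 youths pretested, 88 midtested, and 50 posttested, the rates come to about 27% pre-to-midtest and 59% pre-to-posttest, matching the corresponding Table 2 entries.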



                              Table 1

                  Sample Sizes by Site and Cohort
                    at the Time of Each Testing

                    Pretest              Midtest              Posttest
Site  Cohort   Treatment  Control   Treatment  Control   Treatment  Control

A     II           65        --         21        --         18        --
      III         108        55         32        19         22        16
      IV          101        55         30        27         21        18
      Total       274       110         83        46         61        34

B     II           76        --         40        --         15        --
      III         121        60         88        25         50        20
      IV           75        74         41        32         32        26
      Total       272       134        169        57         97        46

C     II           49        --         28        --          9        --
      III         120        54         47        30         21        14
      IV           66        29         53        10         34        12
      Total       235        83        128        40         64        26

D     II           67        --         15        --          6        --
      III         118        55         52        15         33        16
      IV          176       106         77        54         67        50
      Total       361       161        144        69        106        66

All   II          257        --        104        --         48        --
      III         467       224        219        89        126        66
      IV          418       264        201       123        154       106
      Total      1142       488        524       212        328       172




                              Table 2

                 Attrition Rates by Site and Cohort

                      Pre-to-Midtest          Pre-to-Posttest
Site  Cohort       Treatment   Control      Treatment   Control

A     II              68%        --            72%        --
      III             70%        65%           80%        71%
      IV              70%        51%           79%        67%
      Combined        70%        58%           78%        69%

B     II              47%        --            80%        --
      III             27%        58%           59%        67%
      IV              45%        57%           57%        65%
      Combined        38%        57%           64%        66%

C     II              43%        --            82%        --
      III             61%        44%           82%        74%
      IV              20%        66%           48%        59%
      Combined        46%        52%           73%        69%

D     II              78%        --            91%        --
      III             56%        73%           72%        71%
      IV              56%        49%           62%        53%
      Combined        60%        57%           71%        59%

All   II              60%        --            81%        --
      III             53%        60%           73%        71%
      IV              52%        53%           63%        60%
      Combined        54%        57%           71%        65%




As can be seen (most easily from Table 2), the attrition rates
from the treatment and control groups are quite similar when computed
across sites for cohorts III and IV as well as across the
three cohorts. None of the differences even approaches statistical
significance. This finding appears to suggest that the program's
ability to retain youths was quite low. This appearance, however,
is very deceiving. All youths assigned to the control group were
encouraged (and paid) to participate in the mid- and posttesting
sessions. Of those assigned to the treatment group, however, only
those still enrolled in the program and those who had graduated were
permitted to take the mid- and posttests. In other words, youths
had to do something to stay in the treatment group but nothing to
stay in the control group.

If one makes the assumption that some of the ineligible members
of the treatment group would have returned for testing had they been
invited, a very different picture emerges. To illustrate, suppose
ineligible treatment group members would have returned for testing
at half the rate at which members of the control group returned (a
conservative estimate, we believe). Had this happened, there would
have been 48 more third-cohort treatment group members at midtest
time and 49 more at posttest time. The corresponding increases for
the fourth cohort would have been 51 and 53.
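The projection can be sketched as follows, reading "ineligible" as every pretested treatment group member who was not retested. That reading is an assumption: it reproduces the fourth-cohort figures from the "All" rows of Table 1, but for the third cohort it gives values one higher than the 48 and 49 reported, so the authors' exact procedure evidently differed in some detail. The function names are illustrative.

```python
def projected_extra_returns(pre_t, tested_t, pre_c, tested_c, relative_rate=0.5):
    """Extra treatment-group returns if ineligibles came back at a fraction
    (relative_rate) of the control group's observed return rate."""
    control_return_rate = tested_c / pre_c
    ineligible = pre_t - tested_t  # assumed: all pretested but untested members
    return relative_rate * control_return_rate * ineligible


def projected_attrition(pre_t, tested_t, extra):
    """Treatment-group attrition (percent) after adding the projected returns."""
    return 100.0 * (1 - (tested_t + extra) / pre_t)


# Fourth cohort, all sites (Table 1): 418 treatment and 264 control pretested.
extra_mid = round(projected_extra_returns(418, 201, 264, 123))   # midtest
extra_post = round(projected_extra_returns(418, 154, 264, 106))  # posttest
mid_attrition = projected_attrition(418, 201, extra_mid)
```

Under these assumptions the fourth-cohort increases come to 51 and 53, and the projected pre-to-midtest attrition rate to about 40%.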

From pre-to-midtest the attrition rate for the third-cohort
treatment group would thus have been 43% compared to 60% for the
control group. This difference would have been significant at the
.01 level, one tailed (Chi Square = 6.17, df = 1). Pre-to-midtest
attrition rates for the fourth cohort would have been 40% for the
treatment group versus 53% for the control group. This difference
would also have been statistically significant (Chi Square = 3.64,
df = 1, p < .05 one tailed).

Pre-to-posttest attrition rates would not have been significantly
lower for either the third- or fourth-cohort treatment groups
than for the corresponding control groups. If the two cohorts were
combined, however, the treatment group rate (57%) would then have
been lower than that of the control group (65%) at the .05 (one
tailed) confidence level (Chi Square = 3.54, df = 1).

It is not clear exactly how these numbers should be interpreted.
It does seem, however, that they provide reasonably convincing
evidence of the existence of a treatment effect.

A literal interpretation can say only that significantly higher
percentages of treatment group members could have been mid- and
posttested than was the case for members of the control group. However,
since this difference is clearly attributable to those members
of the treatment group who attended the program (non-attending
treatment group members were assumed to return for testing at a
rate only half that observed in the control group), a case can be
made that the program did have significant holding power.

At this juncture, it should be pointed out that treatment group
students who were attending the program were easier to locate and
inform of the testing sessions than control students. This difference
no doubt contributed somewhat to the apparent treatment
effect. The authors do not believe, however, that it could have
been totally responsible.

Between-site and between-cohort differences are somewhat easier
to interpret. Although the across-cohort, pre-to-posttest (unadjusted)
attrition rates among treatment group members were not significantly
different over all sites (Chi Square = 6.90, df = 3,
.10 > p > .05 two tailed), a comparison of Site A with Site B
produced a significant Chi Square (6.58 with 1 degree of freedom, p
< .02 two tailed). The direction of the difference, furthermore, is
consistent with the general impression that Site A had the least
success in attaining full program implementation, while Site B was
fully implemented for the largest portion of the demonstration
period (see site descriptions in Chapter I).




Perhaps the biggest difference between sites occurred at the
beginning of the demonstration period when Sites A and B got off to
good starts while Sites C and D suffered through serious management
problems. Second-cohort pre-to-posttest attrition rates for treatment
group youths are again consistent with this observation, being
lower at Sites A and B (77%) and higher at Sites C and D (87%). The
difference between these attrition rates is statistically significant
(Chi Square = 3.18, df = 1, p < .05 one tailed).

During the tenure of the third cohort, Sites B and D were
functioning well while Sites A and C continued to experience difficulties.
Third-cohort attrition from Sites B and D was 65% while
that from Sites A and C was 81%. Again, this difference is highly
significant (Chi Square = 8.64, p < .01).

While the fourth cohort was attending, Site D attained full
implementation while Site A continued to have a difficult time.
Attrition rates for the two sites were 62% and 79% respectively.
This difference, too, is statistically significant at the .05 level
(Chi Square = 4.76). Sites B and C continued to operate well, at
least for a large percentage of the time, despite the resignations
of their directors. A comparison of the combined attrition rates
for Sites B, C, and D (58%) with that observed at Site A is again
highly significant (Chi Square = 7.43, p < .01).

Program implementation deteriorated at Site A as a function of
time while it improved at Sites C and D. The improvement at Site D
occurred earlier, however, than at Site C. To determine whether
these implementation changes were accompanied by corresponding
changes in attrition rates, the following comparisons were made:

• At Site A, the pre-to-posttest attrition rate of the second-cohort
  treatment group was compared against that of the
  third- and fourth-cohort treatment groups combined. The
  difference, while in the predicted direction, was found not
  to be statistically significant (Chi Square = .93).

• At Site C, the pre-to-posttest attrition rate of the second-
  and third-cohort treatment groups (combined) was compared
  against that of the fourth-cohort treatment group. The
  difference was in the predicted direction and statistically
  significant at the .001 level (Chi Square = 14.37, df = 1).

• At Site D, the pre-to-posttest attrition rate of the second-cohort
  treatment group was compared against that of the
  third- and fourth-cohort treatment groups combined. Again,
  the difference was in the predicted direction and was statistically
  significant at the .005 level, one tailed (Chi
  Square = 10.34, df = 1).

These various findings, taken together, constitute a convincing
body of evidence that attrition is inversely related to the quality
or extent of program implementation. When the CIP is well implemented
it has significantly better holding power over participating
students than when it is less well implemented.



Test Score Outcomes

The following pages contain a complete summary of all analyses
performed on test scores during the three-year CIP demonstration
period.* These analyses are organized first by subject matter and
then by type of analysis within subject matter. Finally, for each
type of analysis, the pre-to-midtest results are presented prior to
the pre-to-posttest results.



Reading

Tables 3 and 4 present the results of the norm-referenced
analyses performed on treatment group scores. Table 3 summarizes
the findings of the pre-to-midtest analyses while Table 4 encompasses
the pre-to-posttest findings. As can be seen, most of the
gains are statistically significant. Combined across sites and
cohorts, the mean pre-to-midtest gain is 2.6 NCEs and the pre-to-posttest
gain is 6.7 NCEs (just short of one-third of a national-sample
standard deviation).

The pre-to-midtest results, when combined across sites, show
that the smallest gain was made by fourth-cohort students. This
finding is perhaps best explained by the short pre-to-midtest
interval for the fourth cohort (3.5 to 4 months). The largest gain
was made by third-cohort students, a fact largely attributable to
the results at Site D. While the large gain at Site D may have
resulted from the dramatic turn-around that occurred at that site,
the small negative gain made by fourth-cohort students, when implementation
at Site D was even better, seems to contradict this
hypothesis. It may be that the disruption which followed enrollment
of the large fourth cohort (130 interns) was responsible for the
poor showing, but that inference borders on pure speculation.



*The analyses reported here all employ t or F tests. Because
many such tests are reported, their tabled probability levels are
too low. While this problem could theoretically have been avoided
by employing one overall analysis of variance and various subanalyses
within it, the design would have been extremely complex. Furthermore,
interpretive explanations of results at the level of
fourth-order interactions (where individual site, individual cohort,
single criterion, norm-referenced evaluations would fall) are so
cumbersome that the distorted probability levels of multiple t and F
tests were viewed as the lesser of two evils.
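The size of the distortion can be gauged with a familywise error calculation. The sketch below assumes the tests are independent, which is not true of the overlapping site, cohort, and combined analyses in this report, so it indicates the order of magnitude only. The function name is illustrative.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n independent tests,
    each carried out at significance level alpha."""
    return 1 - (1 - alpha) ** n_tests
```

Under that assumption, twenty tests at the .05 level carry roughly a 64% chance of at least one spuriously significant result.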






                              Table 3

       Treatment Group Pre-to-Midtest NCE Gains in Reading:
         Estimates Derived from Norm-Referenced Analyses

                    Pretest    Midtest     NCE
Site  Cohort        NCE Mean   NCE Mean    Gain     N      t       p

A     II             [?]        45.6       1.1     21     .50     --
      III            35.8       39.4       3.6     32    1.21     --
      IV             31.0       36.9       5.8     30    3.22    .005
      Combined       36.3       40.1       3.8     83    2.65    .005

B     II             32.5       36.2       3.2     40    2.56    .01
      III            38.8       41.4       2.6     87    4.64    [?]
      IV             32.0       34.3       2.2     41    1.43    .05
      Combined       35.6       [?]        2.8    168    2.87    .005

C     II             36.2       37.7       1.5     28     .57     --
      III            37.9       40.9       3.0     47    1.97    .05
      IV             38.2       41.2       3.0     53    1.93    .05
      Combined       37.6       40.3       2.7    128    2.63    .005

D     II             31.5       35.4       3.9     15    1.43     --
      III            32.5       37.4       5.0     52    2.39    .025
      IV             29.2       28.2       -.9     77     .66     --
      Combined       30.6       32.3       1.7    144    1.53     --

All   II             35.8       38.4       2.6    104    2.46    .01
      III            36.6       40.0       3.4    218    3.51    .001
      IV             32.4       34.2       1.8    201    2.15    .025
      Combined       34.8       37.5       2.6    523    4.75    .001




When the data are combined across cohorts within sites, Site A
emerges with the largest pre-to-midtest gain. This finding is
exactly the opposite of what one would expect based on what is known
about implementation events at the various sites. The pre-to-posttest
results, on the other hand, place the sites in approximately
the predicted order.

The pre-to-posttest results show a marked improvement in performance
with successive cohorts at Sites C and D and overall.
Again, this finding is consistent with expectations based on implementation
events. The large gain made by fourth-cohort interns at
Site B, on the other hand, is counter-intuitive. One can only
speculate that the disarray resulting from the director's departure
did not affect the efficacy of instruction related to the development
of reading skills.






                              Table 4

       Treatment Group Pre-to-Posttest NCE Gains in Reading:
         Estimates Derived from Norm-Referenced Analyses

                    Pretest    Posttest    NCE
Site  Cohort        NCE Mean   NCE Mean    Gain     N      t       p

A     II             45.2       49.7       4.5     18    1.87    .05
      III            32.6       36.6       4.0     22    1.38     --
      IV             34.6       37.7       3.2     21     .87     --
      Combined       [?]        40.8       [?]     61    2.20    .025

B     II             34.0       37.2       3.2    [?]    1.23     --
      III            41.4       48.5       7.1     50    2.82    .005
      IV             32.4       40.8       8.4     32    4.68    .001
      Combined       37.3       [?]        [?]    [?]    [?]     [?]

C     II             31.0       29.0      -2.0      9     .61     --
      III            34.0       39.6       5.6     21    1.64     --
      IV             39.6       48.3       8.7     34    4.98    .001
      Combined       36.5       [?]        6.2     64    [?]     [?]

D     II             33.5       33.7        .2      6     .06     --
      III            34.8       42.3       7.5     33    2.54    .01
      IV             30.5       40.2       9.7     67    5.47    .001
      Combined       32.0       40.5       8.5    106    5.76    .001

All   II             37.6       39.9       2.3     48    1.61     --
      III            36.9       43.3       6.4    126    4.39    .001
      IV             33.5       41.8       8.3    154    7.79    .001
      Combined       35.4       42.1       6.7    328    8.51    .001




In general, the results of the norm-referenced reading analyses
appear quite positive with the 328 students in the pre-to-posttest
sample showing growth (on the average) from the 24th to the 35th
percentile of the national distribution. This appearance of success
is, however, somewhat lessened when one examines the norm-referenced
gains made by control and comparison students. As shown
in Tables 5 and 6, most of these groups also made statistically
significant norm-referenced gains, some of which are actually larger
than those made by the CIP participants.
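The percentile figures follow from the combined pretest and posttest NCE means (35.4 and 42.1) through the definition of the NCE scale, which has a mean of 50 and a standard deviation of 21.06. The sketch below makes the conversion explicit. The function name is illustrative and the calculation is not taken from the report itself.

```python
from statistics import NormalDist

NCE_SD = 21.06  # standard deviation of the NCE scale (mean is 50)


def nce_to_percentile(nce: float) -> float:
    """Percentile rank in the national norm distribution for a given NCE."""
    return 100.0 * NormalDist().cdf((nce - 50.0) / NCE_SD)


pre_percentile = nce_to_percentile(35.4)   # combined pretest mean
post_percentile = nce_to_percentile(42.1)  # combined posttest mean
gain_in_sd_units = (42.1 - 35.4) / NCE_SD  # gain as a share of one SD
```

Rounded to whole percentiles, the two means correspond to the 24th and 35th percentiles, and the 6.7-NCE gain is about 0.32 of a standard deviation.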

Comparisons between the norm-referenced gain estimates for
third-cohort treatment and control groups favor the treatment group
at all four sites and overall at posttest time. For the fourth
cohort, the treatment group out-performed the control group at two
sites and overall. The midtest results are slightly less favorable.
For the third cohort, treatment group gains are larger at three
of the four sites and overall. The fourth-cohort control group,
however, outgained the treatment group at two sites and overall.
The midtest result at Site D is again difficult to accept at face
value in view of what is known of implementation events at that site
and the fact that the same group shows a very large gain at posttest
time (9.7 NCEs).

                              Table 5

 Control and Comparison Group Pre-to-Midtest NCE Gains in Reading:
         Estimates Derived from Norm-Referenced Analyses

                             Pretest    Midtest     NCE
Site  Cohort  Group          NCE Mean   NCE Mean    Gain     N      t       p

A     III     Control         35.5       36.8       1.3     19     .34     --
      III     Reg. HS         46.7       49.0       2.3     55     .96     --
      III     Alt. HS         45.6       47.0       1.4     50     .59     --
      III     Dropout         47.6       46.4      -1.2     19     .34     --
      IV      Control         34.4       35.6       1.2     27     .44     --

B     III     Control         35.2       38.6       3.4     25    1.39     --
      III     Reg. HS         32.7       36.8       4.2     51    3.00    .005
      III     Alt. HS         40.6       44.3       3.6     54    2.94    .005
      IV      Control         36.6       42.8       6.3    [?]    [?]     .005

C     III     Control         41.9       42.2        .3    [?]    [?]     [?]
      III     Reg. HS         45.9       52.5       6.6     55    [?]     .005
      III     Alt. HS         57.1       56.5       [?]     39    [?]     [?]
      IV      Control         42.5       41.0       [?]     10    [?]     [?]

D     III     Control         32.4       34.2       1.8     15     .55     --
      IV      Control         33.3       35.9       2.6     54    1.48     --

All   III     Control         37.0       38.7       1.6     89    1.14     --
      III     Reg. HS         42.0       46.4       4.4    161    3.86    .001
      III     Alt. HS         46.9       48.6       1.7    143    1.52     --
      III     Dropout         47.6       46.4      -1.2     19     .34     --
      IV      Control         35.2       38.1       2.9    123    2.59    .01


In the case of the comparison groups^ both the regular high 
school and the. dropout groups outgained third-cohort CIP partici- 
•pants at posttest time.* The'regular high school group also out- 
performed the CIP group at midte/jt time. • 

Why the gains made^by the regular high school group are so 
' large is not clear. There is no reason to believe that these 
schools wejre doing ^^n outstanding job teaching their students to 
read. #A more plausible explanation is that some sort of selection 
took place-*-perhaps by the classroom teacher motivated to look good, 

* 34 - 

54- 



1 
1 



Table 6

Control and Comparison Group Pre-to-Posttest NCE Gains in Reading:
Estimates Derived from Norm-Referenced Analyses

                          Pretest    Posttest    NCE
Site  Cohort  Group       NCE Mean   NCE Mean    Gain      N      t      p

A     III     Control       32.5       34.6       2.1      16    .69
      III     Reg. HS       45.8       52.7       6.9      39   2.43   .025
      III     Alt. HS       46.7       41.7      -5.0      28   1.60
      III     Dropout       48.8       56.1       7.3      16   2.19   .025
      IV      Control       36.5       39.8       3.3      18    .83

B     III     Control       36.1       41.9       5.8      20   2.21   .025
      III     Reg. HS       32.9       41.9       9.0      42   5.46   .001
      III     Alt. HS       32.3       39.0       7.0      26   2.91   .005
      IV      Control       33.9       45.0      11.1      26   5.44   .001

C     III     Control       41.7       47.0       5.3      14   1.88   .05
      III     Reg. HS       48.3       55.3       7.0      51   3.27   .005
      III     Alt. HS       57.4       56.7       -.7       8    .09
      IV      Control       29.8       30.2        .4      12    .12

D     III     Control       31.0       34.6       3.6      16   1.36
      IV      Control       32.4       37.4       5.0      50   2.18   .025

All   III     Control       35.2       39.5       4.3      66   3.10   .005
      III     Reg. HS       42.7       50.3       7.6     132   5.93   .001
      III     Alt. HS       42.0       42.5        .5      62    .24
      III     Dropout       48.8       56.1       7.3      16   2.19   .025
      IV      Control       33.2       38.8       5.6     106   3.94   .001


perhaps by the students themselves, so that only the students who
had shown improvement completed the mid- and posttests. Since there
were 2[illegible] students in the regular high school group at pretest
time and only 161 and 132 at mid- and posttest times respectively, this
explanation is at least possible, if not particularly compelling.

The large gain made by the dropout group must be interpreted
cautiously. With only 16 members in the group, the size of the gain
could vary over a wide range. Although the differences were not
tested, it is unlikely that the dropout group's gain is
significantly different from that of the third-cohort treatment group
either at Site A or overall.
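
To give a sense of how wide that range is, the standard error of the
dropout group's gain can be backed out of the figures reported in
Table 6 (a gain of 7.3 NCEs, t = 2.19, N = 16). The sketch below is
illustrative only. It assumes that the reported t is the mean gain
divided by its standard error with N - 1 degrees of freedom, which
the report does not state explicitly; 2.131 is the standard two-tailed
95 percent critical value of t for 15 degrees of freedom, and the
function name is hypothetical.

```python
# Rough 95% confidence interval for the dropout group's reading gain.
# Assumption (not stated in the report): t = mean gain / standard error,
# with N - 1 = 15 degrees of freedom.

GAIN = 7.3         # NCE gain, dropout group (Table 6)
T_STAT = 2.19      # reported t for that gain
T_CRIT_15 = 2.131  # two-tailed 95% critical value of t, 15 df


def gain_interval(gain, t_stat, t_crit):
    """Return (standard error, lower bound, upper bound) for a gain."""
    se = gain / t_stat
    half_width = t_crit * se
    return se, gain - half_width, gain + half_width


se, low, high = gain_interval(GAIN, T_STAT, T_CRIT_15)
print(f"SE = {se:.2f} NCEs, 95% interval = {low:.1f} to {high:.1f} NCEs")
```

Under these assumptions the interval spans roughly 14 NCEs, which is
in keeping with the caution urged above.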

Tables 7 and 8 present the results of the control group
comparisons performed by means of covariance analysis (ANCOVA) on the
reading test scores. The generally negative findings of these
analyses are not inconsistent with those of the norm-referenced






Table 7

Treatment Group NCE Gains in Reading at Midtest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Mid-
Site  Cohort  Group      Mean     test Mean      Gain           N             F         p

A     III     Treat.     35.8       39.3          2.4           32           .23
              Control    35.5       36.9                        19
      IV      Treat.     31.0       38.1          3.9           30          1.98
              Control    34.4       34.2                        27
      Comb.   Treat.     33.5       38.7          3.3           62          1.50
              Control    34.9       35.4                        46

B     III     Treat.     38.8       40.8           .1           87           .01
              Control    35.2       40.7                        25
      IV      Treat.     32.0       36.1         -4.3           41          2.80
              Control    36.6       40.5                        32
      Comb.   Treat.     36.6       39.0         -2.4          128          1.38
              Control    35.9       41.3                        57

C     III     Treat.     37.9       42.2          2.1           47           .65
              Control    41.9       40.1                        30
      IV      Treat.     38.2       41.8          4.1           53          1.11
              Control    42.5       37.7                        10
      Comb.   Treat.     38.0       42.0          2.7          100          1.64
              Control    42.0       39.4                        40

D     III     Treat.     32.5       37.4          3.1           52           .57
              Control    32.4       34.3                        15
      IV      Treat.     29.2       29.9         -3.7           77          2.74
              Control    33.3       33.6                        54
      Comb.   Treat.     30.5       32.8         -1.2          129           .37
              Control    33.1       34.0                        69

All   III     Treat.     36.2       40.2      [illegible]   [illegible]  [illegible]
              Control    37.0       38.2                        89
      IV      Treat.     32.4       35.1         -1.4          201          1.06
              Control    35.2       36.5                       123
      Comb.   Treat.     34.6       37.6           .1          419           .01
              Control    36.0       37.5                       212






Table 8

Treatment Group NCE Gains in Reading at Posttest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Post-
Site  Cohort  Group      Mean     test Mean      Gain           N             F         p

A     III     Treat.     32.6       36.5          1.8           22           .18
              Control    32.5       34.7                        16
      IV      Treat.     34.6       38.3          -.7           21           .02
              Control    36.5       39.1                        18
      Comb.   Treat.     33.6       37.5           .6           43           .03
              Control    34.6       36.9                        34

B     III     Treat.     41.4    [illegible]  [illegible]       50       [illegible]
              Control    36.2       44.3                        20
      IV      Treat.     32.4       41.4         -2.8           32       [illegible]
              Control    33.9       44.2                        26
      Comb.   Treat.     37.9       44.7          -.5           82           .04
              Control    34.9       45.2                        46

C     III     Treat.     34.0       42.5          -.3           21           .00
              Control    41.7       42.8                        14
      IV      Treat.     40.0       45.9          8.9           34          5.68       .01
              Control    29.8       37.0                        12
      Comb.   Treat.     37.4       44.6          4.5           55          2.43
              Control    36.2       40.1                        26

D     III     Treat.     34.8       41.3          4.7           33       [illegible]
              Control    31.0       36.6                        16
      IV      Treat.     30.5       40.9          4.5           67          2.52
              Control    32.4       36.4                        50
      Comb.   Treat.     31.9       41.0          4.4          100       [illegible]
              Control    32.1       36.6                        66

All   III     Treat.     36.9       42.8          2.4          126          1.16
              Control    35.2       40.4                        66
      IV      Treat.     33.4       41.7          2.7          154          2.56
              Control    33.2       39.0                       106
      Comb.   Treat.     35.0       42.2          2.6          280          3.47       .05
              Control    34.0       39.6                       172







analyses. The gains are highly similar, in fact, to what would be
obtained by subtracting the norm-referenced gains of the control
groups from those of corresponding treatment groups.
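
The Gain column in the covariance tables can be read as the difference
between the adjusted treatment mean and the adjusted control mean. A
minimal check of that reading, using rows copied from Table 7 (Sites A
and D), is sketched below; the function name is illustrative and not
part of the original analysis.

```python
# Check that the Gain column of Table 7 is the treatment-minus-control
# difference between adjusted midtest means (values copied from Table 7).

ROWS = [
    # (site, cohort, adj. treatment mean, adj. control mean, reported gain)
    ("A", "III",   39.3, 36.9,  2.4),
    ("A", "IV",    38.1, 34.2,  3.9),
    ("A", "Comb.", 38.7, 35.4,  3.3),
    ("D", "III",   37.4, 34.3,  3.1),
    ("D", "IV",    29.9, 33.6, -3.7),
    ("D", "Comb.", 32.8, 34.0, -1.2),
]


def gain_discrepancies(rows):
    """Return |(treatment - control) - reported gain| for each row."""
    return [abs((treat - ctrl) - gain) for _, _, treat, ctrl, gain in rows]


for row, diff in zip(ROWS, gain_discrepancies(ROWS)):
    print(row[0], row[1], f"discrepancy = {diff:.2f}")
```

Some other rows of Tables 7 and 8 differ from this simple difference
by 0.1 NCE, presumably because the published means were rounded after
the gain was computed.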

No statistically significant ANCOVA gain estimates were found
at midtest time. At posttest time only 2 of the 12 individual-site
analyses produced statistically significant gain estimates, although
the overall (across sites, across cohorts) estimate is also
significant. This last finding, of course, is the most important as it
verifies that treatment group students, on the average, outperformed
control group students.

There is some reason to believe that the gain estimates derived 
from the ANCOVAs may be biased. Apart from the very high attrition 
rates in both treatment and control groups (which could have led to 
systematic differences between groups), the fact that all members of 
all control groups applied for, but were denied admission to the 
CIP may have had some effect on their motivation. Indeed, it seems 
likely that the so-called John Henry effect (Saretsky, 1972) may 
have been operating and may have artificially inflated the gains 
made by the various control groups. 

Tables 9 and 10 present the results of the comparison group
analyses derived through use of standardized gain procedures. As
was the case with the covariance analyses, none of the gains was
found to be statistically significant at midtest time. At posttest
time, only one of the individual-site and one of the across-site
estimates was significant. It is interesting to note that the one
significant across-site gain involved the alternative high school
comparison group, suggesting that the CIP is outperforming other
programs serving similar youths.

Site B is, not surprisingly, an exception to this general
trend. The entire alternative high school group at Site B was
enrolled in a single program that provides intensive remedial
reading instruction.

Overall, the comparison group analyses were marred by large
initial differences between treatment and comparison groups.
Although every effort was made to select low achievers, it is clear
that this goal was only achieved at Site B (where, in fact, our
efforts were somewhat too successful). At Sites A and C most
comparison groups are only slightly below the national median (an NCE
of 50) and one is substantially above it. With differences as large
as these, any attempt at statistical equating requires assumptions
of heroic proportions.

Tables 11 and 12 present the results of the matched-pairs
analyses. While, in theory, these analyses might have provided the best
insights relative to program impact, high attrition produced
extremely small sample sizes. As a result, only Site C shows a
significant gain at midtest time and only Site D at posttest time. The
across-site gain at posttest time is also significant for the third
cohort.






Table 9

Treatment Group NCE Gains in Reading at Midtest Time:
Estimates Derived from Standardized Gain Analyses, Third Cohort

                   Pretest    Adj. Mid-
Site  Group         Mean      test Mean       Gain          N           F         p

A     Treatment     35.8        47.2           2.7          32         .46
      Reg. HS       46.7        44.5                        55
      Treatment     35.8        46.6           4.2          32        1.11
      Alt. HS       45.6        42.4                        50
      Treatment     35.8        45.1           8.3          32     [illegible]
      Dropout       47.6        36.8                        19

B     Treatment     38.8     [illegible]   [illegible]      87     [illegible]
      Reg. HS       32.7     [illegible]                    51
      Treatment     38.8        42.2           -.9          87         .16
      Alt. HS       40.6        43.1                        54

C     Treatment     37.9     [illegible]      -3.1          47        1.48
      Reg. HS       45.9        48.6                        55
      Treatment     37.9     [illegible]       1.8          47     [illegible]
      Alt. HS       57.1        45.8                        39

All   Treatment     38.0        43.0          -1.1         166         .46
      Reg. HS       42.0        44.1                       161
      Treatment     38.0        45.3           1.9         166        1.42
      Alt. HS       46.9        43.4                       143
      Treatment     35.8        45.1           8.3          32     [illegible]
      Dropout       47.6        36.8                        19









Table 10

Treatment Group NCE Gains in Reading at Posttest Time:
Estimates Derived from Standardized Gain Analyses, Third Cohort

                   Pretest    Adj. Post-
Site  Group         Mean      test Mean       Gain          N           F           p

A     Treatment     32.6        46.7       [illegible]      22     [illegible]
      Reg. HS       45.8        47.0                        39
      Treatment     32.6        47.1          13.7          22     [illegible]  [illegible]
      Alt. HS       46.6        33.4                        28
      Treatment     32.6        45.5           1.8          22         .16
      Dropout       48.8        43.7                        16

B     Treatment     41.4     [illegible]      -3.3          50     [illegible]
      Reg. HS       32.9        47.3                        42
      Treatment     41.4        44.9           -.9          50         .05
      Alt. HS       32.3        45.8                        26

C     Treatment     34.0        50.2       [illegible]      21     [illegible]
      Reg. HS       48.3        50.9                        51
      Treatment     34.0        48.0          13.1          21        2.87
      Alt. HS       57.4        34.9                         8

All   Treatment     37.6        47.0           -.9          93         .19
      Reg. HS       42.7        47.9                       132
      Treatment     37.6        45.8           6.5          93        5.23        .01
      Alt. HS       42.0        39.3                        62
      Treatment     32.6        45.5           1.8          22         .16
      Dropout       48.8        43.7                        16









Table 11

Treatment Group NCE Gains in Reading at Midtest Time:
Estimates Derived from Matched Pairs Analyses

                       Mean Mid-    Mean Mid-
                       test NCE     test NCE     NCE
Site       Cohort      Treatment    Control      Gain      N      t      p

A          III           33.9         26.4        7.5       7   1.05
           IV            39.6         34.6        4.9       7   1.07
           Combined      36.7         30.5        6.2      14   1.53

B          III           35.9         37.2       -1.3      23    .39
           IV            43.3         48.1       -4.8      13    .86
           Combined      38.6         41.2       -2.6      36    .88

C          III           42.6         36.4        6.2      17   1.81   .05
           IV            48.2         41.6        6.7       9   1.07
           Combined      44.3         38.2        6.3      26   2.10   .025

D          III           38.1         34.1        4.0      13   1.13
           IV            29.9         32.7       -2.8      23    .78
           Combined      32.9         33.2        -.3      36    .12

All Sites  III           38.0         35.0        3.0      60   1.51
           IV            37.7         38.4        -.6      52    .25
           Combined      37.9         36.6        1.3     112    .83






By far the most positive results with respect to reading
achievement are observed in the norm-referenced analyses.
Surprisingly, however, the norm-referenced gain estimates for most of
the control and comparison groups are also positive rather than zero
as might have been expected (at least for the regular high school
comparison groups). If these control and comparison group gains are
"real," then the norm-referenced analyses produced the most valid
gain estimates. The possibility must be acknowledged, however, that
these gains are no more than artifacts of the norm-referenced
procedures employed in the evaluation.






Table 12

Treatment Group NCE Gains in Reading at Posttest Time:
Estimates Derived from Matched Pairs Analyses

                       Mean Post-   Mean Post-
                       test NCE     test NCE       NCE
Site       Cohort      Treatment    Control        Gain          N            t         p

A          III           38.7         29.9          8.8           2          .58
           IV            39.9         32.6          7.4           5          .98
           Combined      39.6         31.8          7.8      [illegible]  [illegible]

B          III           41.6         42.4          -.8          14          .16
           IV            41.3         52.9        -11.6           9         1.90
           Combined      41.5      [illegible]  [illegible]  [illegible]  [illegible]

C          III           56.0         56.2          -.2           4          .02
           IV            39.3         27.8         11.5           5         1.16
           Combined      46.7         40.4          6.3           9          .98

D          III           55.7         43.3         12.4           7         1.85
           IV            45.6         39.9          5.6          17         1.18
           Combined      48.5         40.9          7.6          24         1.96      .05

All Sites  III           47.2         43.8          3.4          27         3.39      .005
           IV            42.9         40.5          2.4          36          .70
           Combined      44.7         41.9          2.8          63         1.14






As mentioned earlier, it was necessary to implement the
norm-referenced model in a somewhat unorthodox manner. Two specific
deviations from standard implementation procedures could have
introduced some distortions. The first deviation was the
extrapolation and interpolation of normative data to accommodate the
flexible testing schedule imposed on the study by various practical
considerations. The second deviation was the assignment of students to
grade-level norms on the basis of age rather than their actual grade
placement. Either of these procedural variations could have
introduced bias into the analyses. The authors, however, are unable to
generate a plausible explanation as to why the bias should have been
consistently positive regardless of testing times or type of group.
We are inclined instead to favor the hypothesis that the
norm-referenced gain estimates are accurate and that the gains apparently
made by the control and comparison groups resulted from some
combination of the John Henry effect and a selection bias. In any
case, it should be remembered that, overall, the treatment group
significantly outperformed the control group and the alternative
high school group.




Math

Tables 13 and 14 summarize the pre-to-midtest and
pre-to-posttest norm-referenced analyses of mathematics test scores. As
can be seen, nearly half of the gain estimates at midtest time are
statistically significant and a majority are significant at posttest
time. Combined across sites and cohorts, the mean pre-to-midtest
gain is 2.2 NCEs and the pre-to-posttest gain is 4.3 NCEs. The
latter gain is somewhat smaller than that observed for reading
achievement, a finding that is readily explainable in terms of the
difficulty all of the sites experienced in hiring and retaining
qualified math instructors.



Table 13

Treatment Group Pre-to-Midtest NCE Gains in Math:
Estimates Derived from Norm-Referenced Analyses

                    Pretest       Midtest
Site  Cohort        NCE Mean      NCE Mean        Gain           N            t         p

A     II          [illegible]   [illegible]   [illegible]        21      [illegible]
      III            19.2       [illegible]   [illegible]   [illegible]  [illegible]
      IV             25.2       [illegible]   [illegible]   [illegible]  [illegible]
      Combined       24.4          28.4           4.0            83         2.01      .025

B     II             23.4          24.9           1.5            40         1.49
      III            27.3          30.0           2.7            87         1.74      .05
      IV             24.9          27.2           2.2            41         1.58
      Combined       25.8          28.1           2.3           168         2.45      .01

C     II             31.6          30.7           -.9            28          .32
      III            31.0          34.7           3.6            46         2.59      .01
      IV             31.2          31.8            .7            53          .45
      Combined       31.2          32.6           1.6           127         1.36

D     II             26.0          30.2           4.2            14         1.41
      III            23.7          26.9           3.2            48         2.37      .025
      IV             23.6          23.8            .3            77          .19
      Combined       23.9          25.5           1.7           139         1.77      .05

All   II             27.6          28.8           1.2           103          .91
      III            26.1          29.8           3.7           213         4.03      .001
      IV             26.1          27.2           1.1           201         1.29
      Combined       26.4          28.6           2.2           517         6.91      .001







Table 14

Treatment Group Pre-to-Posttest NCE Gains in Math:
Estimates Derived from the Norm-Referenced Analyses

                    Pretest       Posttest       NCE
Site  Cohort        NCE Mean      NCE Mean       Gain            N            t         p

A     II             30.6          36.8           6.3            18         1.75      .05
      III            16.1          30.1          14.0            22         3.28      .005
      IV             23.6          22.3          -1.3            21          .61
      Combined       23.0          29.4           6.4            61         3.0       .005

B     II             20.8          24.7           3.9            15         1.29
      III            27.1          36.0           8.9            50         4.42      .001
      IV             25.7          25.4           -.3            32          .11
      Combined       25.7          30.8           5.1            97         3.66      .001

C     II             27.5          24.2          -3.2             9          .97
      III            28.9          29.1            .2            21          .07
      IV             32.2          38.5           6.3            34         2.92      .005
      Combined       30.5       [illegible]       3.0       [illegible]  [illegible]  .05

D     II             20.9          29.7           8.8             6         1.68
      III            25.5          30.5           5.0            32         2.45      .025
      IV             24.0          25.8           1.7            67         1.00
      Combined       24.3          27.4           3.1       [illegible]  [illegible]

All   II             25.8          29.8           4.0            48         2.14      .025
      III            25.1          32.4           7.3           125         5.44      .001
      IV             26.1          28.0           1.9           154         1.81      .05
      Combined       25.7          30.0           4.3           327         5.52      .001



The pre-to-midtest results, when combined across sites, show
that the smallest gain was made by fourth-cohort students. Again it
seems likely that this finding is best explained by the short
pre-to-midtest interval for this cohort. The largest gain was made by
third-cohort students, a not surprising outcome in view of the fact
that there were fewer implementation problems during the time period
in question than was the case during the tenure of either the
second or fourth cohorts. What is surprising is that the largest
gain was made at Site A. However, despite other problems at that
site, it did have an excellent math teacher.

The authors were initially somewhat concerned about the quite
low mean pretest score for the third-cohort group at Site A,
especially since the corresponding score for the control group is
11.5 NCEs higher. Initially we thought that there might have been a
few invalid scores that would not only account for the low pretest
mean but also for the large gains both from pre- to midtest and from
pre- to posttest. Examination of the raw data, however, revealed no
such problem. The difference between the pretest scores of the
treatment and control groups appears to be the result of
high-scoring members of the treatment group failing to enroll in the CIP
or dropping out before midtest time. (The pretest means of the two
groups prior to attrition were 25.8 and 26.1 NCEs respectively.)
The reality of the pre- to midtest gain, furthermore, is attested to
by the continued growth from mid- to posttest which could not result
from invalid pretest scores.

When the data are combined across cohorts within each site,
Site A emerges with the largest pre-to-midtest and pre-to-posttest
gain. In both cases, the third cohort is primarily responsible.
The outstanding math instructor was not hired until some time after
the second cohort enrolled and left before the fourth cohort entered
the program.

The trend toward improvement over time that was observed in
reading at both Sites C and D is seen only at Site C in math. At
Site D the gain made by the fourth cohort was less than that made by
the third. This reversal is attributed to the departure of the
site's excellent science teacher who also often taught math classes.
His departure more than offset the general improvement in climate
that was reported earlier.

The pre-to-posttest results show much the same pattern that was
observed at midtest time. However, when summarized across sites,
the gains made by all three cohorts are statistically significant.
The 6.3 NCE gain made by second-cohort students at Site A adds further
credibility to the effectiveness of the math instructor at that
site. She joined the program just before the second cohort was
midtested. Similarly, the 6.2 NCE gain made by third-cohort interns
at Site B between mid- and posttests can be attributed to the fact
that a well qualified and talented math instructor was finally hired
at that site. Unfortunately, he left again after only six months.

Overall, the norm-referenced results are encouraging. They
also suggest that larger gains would have occurred had math teaching
positions been vacant less often. In any case, the 327 students who
had both pre- and posttests moved from a national percentile rank of
12.4 to 17.1. It is perhaps noteworthy that the math achievement of
CIP students is substantially below the level in reading.
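
The percentile ranks quoted above follow from the NCE means in Table
14 (25.7 at pretest, 30.0 at posttest). NCEs are a normalized scale
with a mean of 50 and a standard deviation of 21.06, so an NCE can be
converted to a percentile rank through the normal distribution. The
sketch below illustrates that conversion; it assumes the report
converted the mean NCEs directly, and the function name is
illustrative.

```python
import math

NCE_SD = 21.06  # standard deviation of the NCE scale


def nce_to_percentile(nce):
    """Convert a normal curve equivalent to a national percentile rank."""
    z = (nce - 50.0) / NCE_SD
    # Cumulative normal probability, expressed as a percentage.
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for label, nce in [("pretest", 25.7), ("posttest", 30.0)]:
    print(f"{label}: NCE {nce} -> percentile {nce_to_percentile(nce):.1f}")
```

The two conversions reproduce the percentile ranks of 12.4 and 17.1
given in the text.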

Tables 15 and 16 present summaries of the norm-referenced
analyses performed on control and comparison group math achievement
data. Only a few of these gain estimates are statistically
significant and most of them are smaller than those made by the
corresponding treatment groups. Summarized across sites, none of the
control or comparison group gains at midtest time exceed those made
by the corresponding treatment group. The same situation prevails
at posttest time, with the single exception that the fourth-cohort
control group outgained the treatment group by .2 NCEs.
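
The across-site comparison at posttest time can be reproduced by
subtracting the control group gains in Table 16 from the treatment
group gains in Table 14. The short sketch below does this for the two
cohorts that had control groups; the dictionary and function names are
illustrative only.

```python
# Pre-to-posttest math gains in NCEs, across all sites.
# Treatment values are from Table 14, control values from Table 16.
TREATMENT_GAIN = {"III": 7.3, "IV": 1.9}
CONTROL_GAIN = {"III": 3.2, "IV": 2.1}


def net_gain(cohort):
    """Treatment gain minus control gain for one cohort, in NCEs."""
    return round(TREATMENT_GAIN[cohort] - CONTROL_GAIN[cohort], 1)


for cohort in ("III", "IV"):
    print(f"Cohort {cohort}: net gain = {net_gain(cohort):+.1f} NCEs")
```

The third cohort shows a net advantage for the treatment group, while
the fourth-cohort difference is the .2 NCE deficit noted above.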






Table 15

Control and Comparison Group Pre-to-Midtest NCE Gains in Math:
Estimates Derived from Norm-Referenced Analyses

                          Pretest      Midtest        NCE
Site  Cohort  Group       NCE Mean     NCE Mean       Gain          N         t           p

A     III     Control       30.7         28.7         -2.0          19        .59
      III     Reg. HS       41.3         48.1          6.8          54       2.33       .025
      III     Alt. HS       37.6         40.0          2.3          50        .89
      III     Dropout       40.4         42.5          2.1          19        .62
      IV      Control       25.4         21.9         -3.5          27    [illegible]

B     III     Control       28.9         34.7          5.8          25       1.61
      III     Reg. HS       35.0         36.8          1.8          51       1.92       .025
      III     Alt. HS       38.2         37.1         -1.0          53        .61
      IV      Control       28.9         27.7      [illegible]      32    [illegible]

C     III     Control       26.6         24.8         -1.9          30        .82
      III     Reg. HS       41.8      [illegible]  [illegible]      55    [illegible]
      III     Alt. HS       48.5         51.2          2.7          39       1.34
      IV      Control       32.0         34.0      [illegible]      10    [illegible]

D     III     Control       29.1         32.4          3.3          14        .71
      IV      Control       26.4         27.3           .9          54        .42

All   III     Control       28.5         29.6          1.1          88        .67
      III     Reg. HS       39.5         43.1          3.6         160       2.92       .005
      III     Alt. HS       40.8         42.0          1.2         142        .93
      III     Dropout       40.4         42.5          2.1          19        .62
      IV      Control       27.3         26.7          -.5         123        .43






At posttest time, the smallest gain (-3.7 NCEs) was registered 
by the alternative high school comparison group, suggesting a real 
superiority of the CIP compared to other like programs. This dif- 
ference is most marked at Site A where the CIP is only one of sev- 
eral alternative programs in the school district. 

The treatment-control analyses performed using covariance
analysis are summarized in Tables 17 and 18. Only three of the gain
estimates are statistically significant at midtest time and none is
significant at posttest time. The larger treatment group gains made
by third-cohort students at Site B and by fourth-cohort students at
Site C are largely offset by the sizeable gains registered by the
corresponding control groups. Again, selection biases and John
Henry effects may have been operative.






Table 16

Control and Comparison Group Pre-to-Posttest NCE Gains in Math:
Estimates Derived from Norm-Referenced Analyses

                          Pretest      Posttest       NCE
Site  Cohort  Group       NCE Mean     NCE Mean       Gain          N         t           p

A     III     Control       29.4         31.6          2.2          16       2.19       .025
      III     Reg. HS       42.2         42.8           .6          39        .29
      III     Alt. HS       42.6         33.1         -9.5          28       2.39
      III     Dropout       39.6         41.3          1.7          16        .37
      IV      Control       28.2      [illegible]  [illegible]      18    [illegible]

B     III     Control       26.4         30.5          4.1          20       2.95       .005
      III     Reg. HS       37.5         42.7          5.2          42       3.29       .005
      III     Alt. HS       35.5         36.7          1.2          26        .42
      IV      Control       29.0      [illegible]  [illegible]      26    [illegible]

C     III     Control       33.2         37.5          4.3          13    [illegible]
      III     Reg. HS       42.6      [illegible]  [illegible]      51    [illegible]
      III     Alt. HS       46.2         47.2          1.0           8        .31
      IV      Control       24.3      [illegible]  [illegible]      11       1.44

D     III     Control       25.3         27.1          1.8          15        .41
      IV      Control       26.0      [illegible]  [illegible]      50    [illegible]

All   III     Control       28.3         31.5          3.2          65       1.93       .05
      III     Reg. HS       40.9         43.1          2.2         132       2.12       .025
      III     Alt. HS       40.1         36.4         -3.7          62       1.60
      III     Dropout       39.6         41.3          1.7          16        .37
      IV      Control       26.9         29.0          2.1         105       1.47






At Site A, where the third-cohort norm-referenced achievement
gain is very large, it is somewhat surprising to find that the
covariance analysis shows a substantially smaller and statistically
non-significant gain, especially since the control group's
(norm-referenced) gain is comparatively small (2.2 NCEs). In fact, the
apparent inconsistency stems from the large difference between the
pretest scores of the two groups. It is clear that the two groups
were equivalent initially but experienced systematically different
types of attrition. Students who remained in the treatment group at
mid- and posttest times were clearly different from those who
remained in the control group at the same times.

Real differences between groups result in a systematic
undercorrection of posttest scores when traditional ANCOVA procedures are
used (Campbell & Boruch, 1975). In this particular case at least,
it appeared that a more valid gain estimate would be obtained using
standardized gain analysis. When this was done, a pre-to-midtest



Table 17

Treatment Group NCE Gains in Math at Midtest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Mid-
Site  Cohort  Group      Mean     test Mean      Gain           N             F         p

A     III     Treat.     19.2       29.5          5.8           32          1.37
              Control    30.7       23.7                        19
      IV      Treat.     25.2       27.5          5.7           30          2.40
              Control    25.4       21.8                        27
      Comb.   Treat.     22.1       28.6          6.2           62          4.35      .025
              Control    27.6       22.4                        46

B     III     Treat.     27.3       30.3         -3.6           87          1.23
              Control    28.9       33.9                        25
      IV      Treat.     24.9       28.5          2.6           41          1.32
              Control    28.9       26.0                        32
      Comb.   Treat.     26.6       29.6      [illegible]      128       [illegible]
              Control    28.9       29.6                        57

C     III     Treat.     31.0       33.2          6.2           46          6.08      .001
              Control    26.6       27.0                        30
      IV      Treat.     31.2       31.9         -1.6           53           .24
              Control    32.0       33.5                        10
      Comb.   Treat.     31.1       32.4          3.5           99          3.39      .05
              Control    28.0       28.9                        40

D     III     Treat.     23.7       28.0          -.9           48           .07
              Control    29.1       28.9                        14
      IV      Treat.     23.6       24.7         -1.3           77           .29
              Control    26.4       26.0                        54
      Comb.   Treat.     23.6       26.0          -.7          125           .12
              Control    26.9       26.6                        68

All   III     Treat.     26.1       30.3          2.0          213          1.34
              Control    28.5       28.4                        88
      IV      Treat.     26.1       27.5          1.3          201           .98
              Control    27.3       26.2                       123
      Comb.   Treat.     26.1       29.0          1.9          414          3.03      .05
              Control    27.8       27.1                       211




Table 18

Treatment Group NCE Gains in Math at Posttest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Post-
Site  Cohort  Group      Mean     test Mean      Gain           N             F         p

A     III     Treat.     16.1       33.6          6.8           22          1.28
              Control    29.4       26.8                        16
      IV      Treat.     23.6       23.6         -7.6           21          2.32
              Control    28.2       31.2                        18
      Comb.   Treat.     19.8       28.6          -.7           43           .03
              Control    28.7       29.3                        34

B     III     Treat.     27.1       35.8          4.9           50       [illegible]
              Control    26.4       30.9                        20
      IV      Treat.     25.7       26.6         -3.8           32          1.45
              Control    29.0       30.4                        26
      Comb.   Treat.     26.6       32.3          1.7           82           .54
              Control    27.9       30.6                        46

C     III     Treat.     28.9       30.7         -4.6           21          1.02
              Control    33.2       35.3                        14
      IV      Treat.     32.2       37.3          3.5           34           .77
              Control    24.3       33.8                        11
      Comb.   Treat.     30.9       34.5          -.5           55           .03
              Control    29.3       35.0                        25

D     III     Treat.     25.3       30.5          3.3           32           .68
              Control    25.3       27.2                        15
      IV      Treat.     24.0       26.4          1.2           67       [illegible]
              Control    26.0       25.2                        50
      Comb.   Treat.     24.5       27.7          2.0           99           .93
              Control    25.8       25.7                        65

All   III     Treat.     25.1    [illegible]  [illegible]      125       [illegible]
              Control    28.3       29.9                        65
      IV      Treat.     26.1       28.3          -.4          154           .05
              Control    26.9       28.7                       105
      Comb.   Treat.     25.6       30.5          1.4          279          1.08
              Control    27.5       29.1                       170







gain estimate of 9.6 NCEs was obtained. The pre-to-posttest gain
was 12.7 NCEs. Both estimates are statistically significant at the
.05 level.

Standardized gain analysis was also applied to the across-site
comparison between third-cohort treatment and control groups. This
approach raised the gain estimate from 3.3 to 4.6 NCEs and the
latter value is significant at the .025 level. When third and
fourth cohorts were combined, the standardized gain estimate rose to
2 NCEs but remained statistically non-significant.

Tables 19 and 20 present the results of the standardized gain
analyses performed on treatment and comparison group data. At
midtest time, only one of the ten gain estimates is statistically
significant. At posttest time, on the other hand, only two of ten
fail to attain statistical significance. It should be noted,
however, that the credibility of these highly positive results is
substantially diminished by the very large pretest differences between
groups. Although there is no reason to believe that the analysis
methodology introduced biases in either direction, it is simply not
very informative to make comparisons between groups that have so
little in common. While the fact that the results are positive does
provide some further evidence supporting the success of the CIP, the
gain estimates themselves appear badly inflated, particularly at
Site A and across sites.

The matched pairs analyses, presented in Tables 21 and 22, are
equally uninformative. Only 1 of 30 is significant at the .025
level, an event not unlikely to occur by chance.
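
The chance expectation behind that remark can be made concrete. The
sketch below computes the probability that at least one of 30 tests
reaches the .025 level when no true effect exists. It assumes the 30
tests are independent, which is only approximately true here because
the combined rows reuse the same students as the cohort rows; the
function name is illustrative.

```python
ALPHA = 0.025   # significance level cited for Tables 21 and 22
N_TESTS = 30    # number of matched-pairs gain estimates


def prob_at_least_one(alpha, n_tests):
    """P(at least one significant result) when every null hypothesis is
    true and the tests are independent (an assumption)."""
    return 1.0 - (1.0 - alpha) ** n_tests


print(f"P(at least one of {N_TESTS}) = {prob_at_least_one(ALPHA, N_TESTS):.2f}")
print(f"Expected number significant = {ALPHA * N_TESTS:.2f}")
```

Under the independence assumption, a single significant result among
30 is slightly more likely than not to arise by chance alone.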



Table 19

Treatment Group NCE Gains in Math at Midtest Time:
Estimates Derived from Standardized Gain Analyses, Third Cohort

                   Pretest    Adj. Mid-
Site  Group         Mean      test Mean     Gain      N       F       p

A     Treatment     19.2        42.1         3.2      32     .44
      Reg. HS       41.3        38.9                  54
      Treatment     19.2        38.7         6.6      32    2.18
      Alt. HS       37.6        32.1                  50
      Treatment     19.2        34.2         4.8      32     .87
      Dropout       40.4        29.4                  19

B     Treatment     27.3        33.1         1.6      87     .53
      Reg. HS       35.0        31.5                  51
      Treatment     27.3        34.2         4.0      87    2.74
      Alt. HS       38.2        30.2                  53

C     Treatment     31.0        41.2         2.8      46    1.21
      Reg. HS       41.8        38.4                  55
      Treatment     31.0        43.7         3.1      46    1.54
      Alt. HS       48.5        40.6                  39

All   Treatment     26.8        37.6         1.7     165     .92
      Reg. HS       39.5        35.9                 160
      Treatment     26.8        37.6         3.7     165    4.50    .025
      Alt. HS       40.8        33.9                 142
      Treatment     19.2        34.2         4.8      32     .87
      Dropout       40.4        29.4                  19




Table 20 

Treatment Group NCE Gains in Math at Posttest Time: 
Estimates Derived from Standardized Gain Analyses, Third Cohort 

                     Pretest       Adj. Post- 
Site   Group         Mean          test Mean     Gain          N      F       p 

A      Treatment     16.1          [illegible]   [illegible]   22     10.[?] 
       Reg. HS       42.2          33.1                        39 

       Treatment     16.1          47.6          28.3          22     20.16   .001 
       Alt. HS       42.6          19.3                        28 

       Treatment     16.1          40.8          14.1          22      4.44   .025 
       Dropout       39.6          26.7                        16 

B      Treatment     27.1          41.6           5.6          50      4.10   .025 
       Reg. HS       37.5          36.0                        42 

       Treatment     27.1          39.2           8.7          50      5.73   .025 
       Alt. HS       35.5          30.5                        26 

C      Treatment     [row illegible in source] 
       Reg. HS       42.6          39.6                        51 

       Treatment     [row illegible in source] 
       Alt. HS       46.2          32.3                         8 

All    Treatment     [illegible]   42.9           6.8          [illegible] 
       Reg. HS       40.9          36.1                       132 

       Treatment     24.9          40.0          13.2          93     22.83   .001 
       Alt. HS       40.1          26.0                        62 

       Treatment     16.1          40.8          14.1          22      4.44   .025 
       Dropout       39.6          26.7                        16 








Table 21 

Treatment Group NCE Gains in Math at Midtest Time: 
Estimates Derived from Matched Pairs Analyses 

                        Mean Mid-   Mean Mid- 
                        test NCE    test NCE    NCE 
Site        Cohort      Treatment   Control     Gain     N      t       p 

A           III         17.2        15.7         1.5      7     .18 
            IV          27.3        26.2         1.1      7     .13 
            Combined    22.2        20.9         1.3     14     .23 

B           III         32.0        33.9        -1.9     22     .40 
            IV          31.9        27.5         4.4     17    1.30 
            Combined    31.9        31.1          .8     39     .27 

C           III         32.1        25.2         6.9     13    2.24    .025 
            IV          23.1        29.8        -6.7      5    2.05 
            Combined    29.6        26.4         3.2     18    1.13 

D           III         34.2        38.7        -4.5      5     .48 
            IV          23.2        27.5        -4.3     19    1.10 
            Combined    25.5        29.8        -4.3     24    1.22 

All Sites   III         30.0        29.3          .8     47     .27 
            IV          26.9        27.5         -.7     48     .29 
            Combined    28.4        28.4          .0     95     .02 








Table 22 

Treatment Group NCE Gains in Math at Posttest Time: 
Estimates Derived from Matched Pairs Analyses 

                        Mean Post-    Mean Post- 
                        test NCE      test NCE      NCE 
Site        Cohort      Treatment     Control       Gain      N             t       p 

A           III         [illegible]   15.6           7.8      [illegible]   .65 
            IV          26.7          [illegible]  -16.8       3           1.57 
            Combined    25.1          29.6          -4.5       6            .50 

B           III         27.8          33.3          -5.5      11           1.24 
            IV          30.6          35.7          -5.1      10            .66 
            Combined    29.2          34.4          -5.3      21           1.26 

C           III         49.2          42.6           6.6       2           1.56 
            IV          33.4          26.1           7.3       5            .76 
            Combined    37.9          30.8           7.1       7           1.06 

D           III         24.5          32.4          -7.9       6           1.05 
            IV          19.1          18.3            .8      12            .18 
            Combined    20.9          23.0          -2.1      18            .58 

All Sites   III         28.3          31.5          -3.2      22            .94 
            IV          26.1          27.9          -1.8      30            .52 
            Combined    27.0          29.4          -2.4      52            .97 









Career Development Inventory 

Table 23 presents the pre-to-midtest raw score gains made by 
second-cohort students. Large gains were achieved on the CDI 
Planning scale by students at all sites at both mid- and posttest 
time. Except in two cases where the sample sizes were very small, 
these gains are all statistically significant. Much the same 
picture can be observed with the CDI Resources scale although the 
gains are somewhat smaller and four of them are non-significant. 
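
As an illustration of how a raw score gain and its t statistic of this kind can be computed, the sketch below assumes the reported t values come from a correlated (paired) t-test on each student's pretest and later score. This section of the report does not state the exact procedure, so that is an assumption, and the scores used are hypothetical rather than taken from the tables.

```python
import math
import statistics


def paired_gain_t(pre, post):
    """Return (mean gain, t statistic, degrees of freedom) for paired scores.

    Assumes each position in `pre` and `post` belongs to the same student.
    """
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean_gain = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the gains
    t = mean_gain / (sd / math.sqrt(n))
    return mean_gain, t, n - 1


# Hypothetical pretest and midtest raw scores for five students
pre_scores = [10, 12, 9, 14, 11]
mid_scores = [13, 14, 10, 17, 12]

gain, t_stat, df = paired_gain_t(pre_scores, mid_scores)
print(f"mean gain = {gain:.1f}, t = {t_stat:.2f}, df = {df}")
```

With very small samples, such as the single student in some Site D posttest cells, the standard deviation of the gains cannot be estimated at all, which is why no t value can be reported for those cells.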



Table 23 

Treatment Group Pre-to-Midtest Raw Score Gains: 
Career Development Inventory, Second Cohort 

                 Pretest       Midtest 
                 Mean          Mean          Gain     N             t        p 

Site A 
  Planning        99.7         115.6         16.0     22            1.98     .05 
  Resources       76.3          82.6          6.3     22            1.21 
  Information     [illegible]  [illegible]    1.7     [illegible]   1.53 

Site B 
  Planning       100.0         121.0         21.0     37            6.01     .001 
  Resources       82.0          88.0          6.0     37            1.96     .05 
  Information     11.7          14.2          2.5     37            3.06     .005 

Site C 
  Planning       100.4         114.1         13.7     28            2.05     .025 
  Resources       82.8          90.9          8.1     28            1.30 
  Information     12.0          13.3          1.3     28            1.81     .05 

Site D 
  Planning       109.6         129.5         19.9     15            5.88     .001 
  Resources       86.1          95.9          9.8     15            3.02     .005 
  Information     14.5          15.5          1.0     15            1.06 

All Sites 
  Planning       101.6         119.5         17.9    102            6.25     .001 
  Resources       81.7          88.9          7.2    102            3.[?]7   .005 
  Information     12.1          13.9          1.8    102            3.97     .001 






Table 24 

Treatment Group Pre-to-Posttest Raw Score Gains: 
Career Development Inventory, Second Cohort 

                 Pretest    Posttest 
                 Mean       Mean          Gain          N        t             p 

Site A 
  Planning       102.6      127.7         25.1          18       4.24          .001 
  Resources       75.4       84.1          8.6          17       1.32 
  Information     10.9       14.3          3.4          18 [?]   2.70          .01 

Site B 
  Planning        98.7      [illegible]   [illegible]   15       [illegible] 
  Resources       76.4       87.7         11.3          15       2.54          .025 
  Information     12.9       14.8          1.9          15       1.61 

Site C 
  Planning       101.2      115.3         14.1           9       1.35 
  Resources       76.7       92.6         15.9           9       4.50          .003 
  Information     11.7       10.4         -1.2           9        .87 

Site D 
  Planning        99.0      128.0         29.0           1 
  Resources       72.0      124.0         52.0           1 
  Information     16.2       19.0          2.8           6       2.10          .05 

All Sites 
  Planning       100.9      124.2         23.3          43       6.08          .001 
  Resources       75.9       88.0         12.1          43       3.63          .005 
  Information     12.3       14.3          2.0          48       2.85          .005 



The CDI Information scale shows significant pre-to-midtest 
gains at Sites B and C and significant pre-to-posttest gains at 
Sites A and D. Across sites the gains on this scale are significant 
at both mid- and posttest times. 


In the absence of both normative data and control groups, no 
other analyses of these data appear worth undertaking. It is 
important to note, however, that the analyses which are reported may 
be misleading. There would almost certainly be some growth over 
time without the CIP treatment. This growth, unfortunately, is 
inextricably confounded with whatever gains resulted from the 
treatment. 

Tables 25 and 26 present gain estimates and related statistics 
derived from covariance analyses of treatment and control group 
scores on the CDI Planning scale. Table 25 summarizes the 
pre-to-midtest findings while Table 26 encompasses the pre-to-posttest 






Table 25 

Treatment Group Raw Score Gains 
on the CDI Planning Scale at Midtest Time: 
Estimates Derived from Covariance Analyses 

                           Pretest   Adj. Mid- 
Site   Cohort   Group      Mean      test Mean   Gain      N       F       p 

A      III      Treat.      90.0     115.2        9.6      26     1.26 
                Control    114.8     105.6                 13 

       IV       Treat.     103.3     105.6        5.7      29      .45 
                Control    100.6     100.0                 26 

       Comb.    Treat.      97.0     110.4        9.0      55     2.25 
                Control    105.4     101.4                 39 

B      III      Treat.     105.2     120.7        9.6      80     4.67    .025 
                Control     99.6     111.1                 22 

       IV       Treat.      93.0     111.6        7.7      40     2.81    .05 
                Control     95.5     103.9                 30 

       Comb.    Treat.     101.1     117.3        9.5     120     9.4     .015 
                Control     97.3     107.8                 52 

C      III      Treat.      95.6     110.7        3.9      47      .56 
                Control    103.7     106.8                 29 

       IV       Treat.     103.7     102.3       -6.5      52      .36 
                Control    100.2     108.8                 10 

       Comb.    Treat.      99.9     106.2       -1.5      99      .09 
                Control    102.8     107.7                 39 

D      III      Treat.     106.3     121.4        6.0      47     1.91 
                Control    107.9     113.4                 15 

       IV       Treat.     110.1     118.2        7.8      72     4.23    .025 
                Control    111.1     110.5                 50 

       Comb.    Treat.     108.6     119.6        8.6     119     7.72    .005 
                Control    110.4     111.0                 65 

All    III      Treat.     101.2     117.8        8.6     200    10.25    .001 
                Control    105.2     109.2                 79 

       IV       Treat.     103.8     110.7        4.4     193     2.29 
                Control    103.8     106.2                116 

       Comb.    Treat.     102.5     114.3        6.9     393    11.72    .001 
                Control    104.4     107.4                195 






Table 26 

Treatment Group Raw Score Gains 
on the CDI Planning Scale at Posttest Time: 
Estimates Derived from Covariance Analyses 

                           Pretest   Adj. Post- 
Site   Cohort   Group      Mean      test Mean    Gain      N       F       p 

A      III      Treat.      91.7     122.1         9.8      18      .91 
                Control    106.1     112.3                  16 

       IV       Treat.      98.9     113.5         -.7      20      .01 
                Control    108.3     114.2                  16 

       Comb.    Treat.      95.4     117.5         4.1      38      .37 
                Control    107.2     113.4                  32 

B      III      Treat.     102.9     129.1        23.5      45    12.13    .001 
                Control     99.1     105.6                  17 

       IV       Treat.      94.3     112.4         3.8      32      .73 
                Control     93.8     108.6                  25 

       Comb.    Treat.      99.3     121.8        13.7      77    10.96    .005 
                Control     95.3     108.1                  42 

C      III      Treat.      91.4     117.1         4.9      18      .23 
                Control    104.5     112.2                  13 

       IV       Treat.     106.3     105.7        11.6      34     1.50 
                Control     88.0      94.1                  11 

       Comb.    Treat.     101.2     109.9         6.5      52      .93 
                Control     96.9     103.4                  24 

D      III      Treat.     105.5     125.8        17.9      30     7.67    .005 
                Control    106.1     107.9                  14 

       IV       Treat.     104.7     120.1         6.3      61     1.44 
                Control    109.1     114.4                  45 

       Comb.    Treat.     105.0     122.4         9.5      91     5.18    .05 
                Control    108.4     112.9                  59 

All    III      Treat.      99.9     125.2        15.9     111    16.71    .001 
                Control    103.8     109.3                  60 

       IV       Treat.     102.0     114.4         3.8     147     1.40 
                Control    102.6     110.6                  97 

       Comb.    Treat.     101.1     119.1         9.2     258    13.06    .001 
                Control    103.1     109.9                 157 






results. At midtest time, 5 of 12 individual-site gain estimates 
are statistically significant; at posttest time, only 4. Across 
sites, the third-cohort and the combined third-and-fourth-cohort 
gain estimates are statistically significant. 

In several cases (e.g., Site A, third cohort, mid- and posttest; 
Site C, fourth cohort, posttest), there are large pretest 
differences between treatment and control groups which suggest the 
possibility that ANCOVA may be an inappropriate analytic approach. 
When standardized gain analyses were undertaken, three of the gains 
that were nonsignificant in the ANCOVAs attained statistical 
significance. These gains are as follows: (a) Site A third-cohort 
midtest, 20.7 (F = 5.54, p < .025), (b) Site A combined 
third-and-fourth-cohort midtest, 14.6 (F = 4.44, p < .025), and 
(c) Site A third-cohort posttest, 22.9 (F = 3.05, p < .05). None of 
the other nonsignificant ANCOVA estimates attained significance when 
standardized gain analyses were undertaken, but all of the 
significant ANCOVA estimates remained so, lending increased 
credibility to those findings. 
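
To illustrate why large pretest differences matter for the covariance analyses, the sketch below applies the textbook single-covariate ANCOVA adjustment: each group's posttest mean is shifted along a pooled within-group regression slope to the value expected at the grand pretest mean. This section of the report does not describe its own computation, so the procedure, the function name, and all scores here are assumptions for illustration only.

```python
from statistics import mean


def ancova_adjusted_means(groups):
    """Covariance-adjusted posttest means for each group.

    `groups` maps a group name to a list of (pretest, posttest) pairs.
    Returns (adjusted_means, pooled_slope).
    """
    sxx = 0.0  # pooled within-group sum of squares of the pretest
    sxy = 0.0  # pooled within-group sum of cross-products
    group_means = {}
    all_pretests = []
    for name, pairs in groups.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        mx, my = mean(xs), mean(ys)
        group_means[name] = (mx, my)
        sxx += sum((x - mx) ** 2 for x in xs)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        all_pretests.extend(xs)
    slope = sxy / sxx
    grand_pre = mean(all_pretests)
    adjusted = {
        name: my - slope * (mx - grand_pre)
        for name, (mx, my) in group_means.items()
    }
    return adjusted, slope


# Hypothetical scores: the treatment group starts 10 points lower
data = {
    "treatment": [(90, 110), (100, 118), (110, 126)],
    "control": [(100, 104), (110, 112), (120, 120)],
}

adjusted, slope = ancova_adjusted_means(data)
gain_estimate = adjusted["treatment"] - adjusted["control"]
print(f"pooled slope = {slope:.2f}")
print(f"adjusted means = {adjusted}")
print(f"gain estimate = {gain_estimate:.1f}")
```

The larger the gap between the group pretest means, the more the gain estimate depends on the pooled slope being correct for both groups, which is the usual reason such an adjustment is treated with caution when groups are far apart at pretest.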

There do not appear to be any meaningful differences among 
sites. On the other hand, the difference between third and fourth 
cohorts does appear meaningful. Except at Site C (where the 
fourth-cohort ANCOVA gain estimate at posttest time is distorted by 
the very low pretest score of the control group), the same pattern 
is evident that is seen in the across-site comparisons. The lower 
fourth-cohort gain is attributable to the fact that 
student-counselor interactions were less frequent during the 
extension portion of the demonstration period than during the first 
two years. This reduction, in turn, is due to a number of career 
counselors leaving the program and others becoming overloaded with 
the paperwork created by large fourth-cohort enrollments, the 
inclusion of additional school districts in the 
recruitment/catchment area, and related problems. 

Tables 27 and 28 summarize the standardized gain analyses 
performed on treatment and comparison group CDI Planning scores at 
midtest and posttest times respectively. Most of the gain estimates 
are both large and statistically significant at both midtest and 
posttest time. No clear patterns emerge with respect either to 
sites or comparison groups. There does, however, appear to be some 
continued growth from mid- to posttest. 

Tables 29 and 30 summarize the ANCOVA results for the CDI 
Resources scale at mid- and posttest times respectively. At midtest 
time only one individual-site and none of the across-site gain 
estimates is statistically significant. As was the case with the 
CDI Planning scale, however, there are substantial pretest 
differences between treatment and control groups in a number of 
instances, suggesting that standardized gain analyses might yield 
more valid gain estimates than covariance analyses. When such 
analyses were carried out, the third-cohort gain estimate at Site C 
increased to 12.6 and became statistically significant (F = 6.74, p < 






Table 27 

Treatment Group Raw Score Gains 
on the CDI Planning Scale at Midtest Time: 
Estimates Derived from Standardized Gain Analyses, Third Cohort 

                     Pretest   Adj. Mid- 
Site   Group         Mean      test Mean   Gain     N       F       p 

A      Treatment      90.0     123.5       18.3     26     7.69    .005 
       Reg. HS       106.4     105.2                56 

       Treatment      90.0     130.0       34.3     26    20.37    .001 
       Alt. HS       112.1      95.7                49 

       Treatment      90.0     117.2       19.2     26     5.92    .025 
       Dropout       106.8      98.0                18 

B      Treatment     105.2     120.6       10.3     80     7.69    .005 
       Reg. HS       103.2     110.3                53 

       Treatment     105.2     122.0       14.5     80    11.21    .001 
       Alt. HS       106.3     107.5                52 

C      Treatment      95.6      86.2        2.3     47      .32 
       Reg. HS        57.6      83.9                55 

       Treatment      95.6      88.3       11.5     47     4.87    .025 
       Alt. HS        52.4      76.8                39 

All    Treatment      99.7     109.9        9.5    153    14.30    .001 
       Reg. HS        89.0     100.3               164 

       Treatment      99.7     112.5       17.2    153    33.95    .001 
       Alt. HS        93.3      95.3               140 

       Treatment      90.0     117.2       19.2     26     5.92    .025 
       Dropout       106.8      98.0                18 







.01). The third-cohort, across-site gain also increased and became 
statistically significant (gain = 8.0, F = 9.81, p < .005), as did 
the combined third-and-fourth-cohort, across-site estimate (gain = 
4.6, F = 6.78, p < .005). 

The situation is somewhat more positive at posttest time, with 
two sites showing statistically significant ANCOVA gain estimates 
for one of the two cohorts as well as for the two-cohort 
combination. Across sites, the third-cohort and the combined 
third-and-fourth-cohort gain estimates are statistically significant. 
This pattern, with the fourth-cohort gain nonsignificant, matches that 






Table 28 

Treatment Group Raw Score Gains 
on the CDI Planning Scale at Posttest Time: 
Estimates Derived from Standardized Gain Analyses, Third Cohort 

                     Pretest    Adj. Post- 
Site   Group         Mean       test Mean     Gain     N             F       p 

A      Treatment      91.7      128.8         15.6     18           2.88    .05 
       Reg. HS       102.1      113.2                  39 

       Treatment      91.7      132.9         34.1     18           7.84    .005 
       Alt. HS       106.3       98.8                  28 

       Treatment      91.7      123.7         22.7     18           2.14 
       Dropout        97.0      101.0                  16 

B      Treatment     102.9      127.6         15.3     45           9.35    .005 
       Reg. HS        99.6      112.3                  41 

       Treatment     102.9      128.5         10.0     45           2.85    .05 
       Alt. HS       100.[?]    [illegible]            [illegible] 

C      Treatment      91.4      132.1         15.6     18           4.91    .025 
       Reg. HS       10[?].4    116.5                  51 

       Treatment      91.4      120.5         11.4     18            .75 
       Alt. HS       103.1      109.1                   8 

All    Treatment      97.8      128.9         14.8     81          15.52    .001 
       Reg. HS       104.2      114.1                 131 

       Treatment      97.8      127.6         19.1     81          12.80    .001 
       Alt. HS       103.6      108.5                  61 

       Treatment      91.7      123.7         22.7     18           2.14 
       Dropout        97.0      101.0                  16 







observed in the CDI Planning ANCOVAs. Again, the pattern is 
attributable to the lessened counselor contact available to 
fourth-cohort students. 

Standardized gain analyses performed on CDI Resources posttest 
data raised most of the gain estimates but only one nonsignificant 
ANCOVA estimate attained statistical significance. That was the 
fourth-cohort estimate at Site A which increased from 6.4 to 10.2 
raw score points (F = 3.29, p < .05). 






Table 29 

Treatment Group Raw Score Gains 
on the CDI Resources Scale at Midtest Time: 
Estimates Derived from Covariance Analyses 

                           Pretest   Adj. Mid- 
Site   Cohort   Group      Mean      test Mean   Gain          N       F             p 

A      III      Treat.     76.9      81.1        -2.3          26     [illegible] 
                Control    87.5      83.5                      13 

       IV       Treat.     79.2      80.9         2.4          29      .14 
                Control    83.6      78.5                      26 

       Comb.    Treat.     78.1      81.2         1.3          55      .08 
                Control    84.9      79.9                      39 

B      III      Treat.     78.8      90.1         -.4          80      .01 
                Control    83.2      90.5                      23 

       IV       Treat.     80.3      90.3          .0          40      .00 
                Control    79.3      90.3                      30 

       Comb.    Treat.     79.3      90.2        [illegible]  120      .00 
                Control    81.0      90.3                      53 

C      III      Treat.     75.1      85.6         4.8          47     1.19 
                Control    84.7      80.7                      29 

       IV       Treat.     74.4      77.6          .7          52      .01 
                Control    72.7      76.9                      10 

       Comb.    Treat.     74.7      81.5         2.0          99      .28 
                Control    81.6      79.5                      39 

D      III      Treat.     78.5      89.0         7.9          47     4.30          .025 
                Control    82.5      81.0                      15 

       IV       Treat.     83.8      86.6         -.4          73      .02 
                Control    85.9      86.9                      50 

       Comb.    Treat.     81.7      87.6         2.3         120     1.17 
                Control    85.1      85.4                      65 

All    III      Treat.     77.6      87.5         3.2         200     2.15 
                Control    84.3      84.3                      80 

       IV       Treat.     79.9      84.4         -.1         194      .00 
                Control    82.5      84.5                     116 

       Comb.    Treat.     78.7      86.0         1.7         394     1.21 
                Control    83.2      84.4                     196 





Table 30 

Treatment Group Raw Score Gains 
on the CDI Resources Scale at Posttest Time: 
Estimates Derived from Covariance Analyses 

                           Pretest   Adj. Post- 
Site   Cohort   Group      Mean      test Mean    Gain     N             F       p 

A      III      Treat.     74.1      84.4          2.5     18            .22 
                Control    78.3      81.9                  16 

       IV       Treat.     76.8      87.5          6.4     20           1.48 
                Control    85.4      81.1                  17 

       Comb.    Treat.     75.5      85.6          3.6     38            .89 
                Control    82.0      82.0                  33 

B      III      Treat.     78.1      95.9          3.7     45            .60 
                Control    7[?].0    92.1                  17 

       IV       Treat.     77.9      90.4          6.9     32           3.38 
                Control    76.9      83.5                  25 

       Comb.    Treat.     78.0      93.6          6.7     77           4.78    .025 
                Control    77.8      86.9                  42 

C      III      Treat.     76.9      83.8          -.6     18            .01 
                Control    8[?].8    84.4                  13 

       IV       Treat.     79.1      80.3        -12.4     34           4.53 
                Control    63.9      92.7                  11 

       Comb.    Treat.     78.4      81.0         -8.3     [illegible]  3.43 
                Control    74.1      89.3                  [illegible] 

D      III      Treat.     80.9      94.9         13.6     30          10.46    .01 
                Control    81.9      81.3                  14 

       IV       Treat.     82.4      92.8          4.5     62           1.39 
                Control    84.4      88.3                  45 

       Comb.    Treat.     81.8      93.5          6.9     92           5.43    .025 
                Control    83.8      86.6                  59 

All    III      Treat.     78.0      91.5          5.7    111           3.99    .025 
                Control    80.8      85.8                  60 

       IV       Treat.     79.9      88.8          2.7    148           1.52 
                Control    80.4      86.1                  98 

       Comb.    Treat.     79.1      90.0          4.1    259           5.40    .025 
                Control    80.4      85.9                 158 



The standardized gain, comparison group analyses are summarized 
in Tables 31 and 32. At midtest time the gain estimates are large 
and statistically significant for both the regular and alternative 
high school comparisons at Sites B and C. Across-site gain 
estimates derived from analyses involving these two comparison 
groups are also significant. The pattern is much the same at 
posttest time except for Site C, where the gain estimates decreased 
in size and failed to attain statistical significance. Overall, the 
results of these analyses tend to support those of the ANCOVAs. 

Table 31 

Treatment Group Raw Score Gains 
on the CDI Resources Scale at Midtest Time: 
Estimates Derived from Standardized Gain Analyses, Third Cohort 

                     Pretest   Adj. Mid- 
Site   Group         Mean      test Mean   Gain     N       F       p 

A      Treatment     76.9      81.1         2.3     26      .20 
       Reg. HS       77.3      78.7                 56 

       Treatment     76.9      82.8         3.0     26      .43 
       Alt. HS       80.5      79.8                 49 

       Treatment     76.9      80.6         5.5     26      .47 
       Dropout       76.6      75.1                 18 

B      Treatment     78.8      91.2         8.7     80     8.80    .005 
       Reg. HS       87.3      82.5                 53 

       Treatment     78.8      90.9        10.0     80     9.52    .005 
       Alt. HS       81.5      80.9                 52 

C      Treatment     75.1      90.0         6.7     47     3.16    .05 
       Reg. HS       84.4      83.3                 55 

       Treatment     75.1      90.1        14.0     47     9.64    .005 
       Alt. HS       85.5      76.1                 39 

All    Treatment     77.4      88.6         6.5    153     9.28    .005 
       Reg. HS       81.3      82.0                164 

       Treatment     77.4      88.8         9.1    153    19.77    .001 
       Alt. HS       82.2      79.7                140 

       Treatment     76.9      80.6         5.5     26      .47 
       Dropout       76.6      75.1                 18 



Table 32 

Treatment Group Raw Score Gains 
on the CDI Resources Scale at Posttest Time: 
Estimates Derived from Standardized Gain Analyses, Third Cohort 

                     Pretest   Adj. Post- 
Site   Group         Mean      test Mean    Gain     N       F             p 

A      Treatment     74.1      85.3          2.3     18      .16 
       Reg. HS       75.8      83.0                  39 

       Treatment     74.1      83.0          6.0     18      .75 
       Alt. HS       72.4      77.0                  28 

       Treatment     74.1      81.0         11.3     18     1.32 
       Dropout       68.5      69.7                  16 

B      Treatment     78.1      97.5         12.9     45    12.76          .001 
       Reg. HS       81.2      84.6                  41 

       Treatment     78.1      97.3         14.1     45     9.66          .005 
       Alt. HS       82.4      83.2                  25 

C      Treatment     76.9      90.0          3.2     18    [illegible] 
       Reg. HS       86.4      86.8                  51 

       Treatment     76.9      84.6          6.3     18      .39 
       Alt. HS       84.5      78.3                   8 

All    Treatment     77.0      93.2          8.2     81     9.27          .005 
       Reg. HS       81.6      85.0                 131 

       Treatment     77.0      90.4          9.5     81     6.54          .01 
       Alt. HS       78.1      80.9                  61 

       Treatment     74.1      81.0         11.3     18     1.32 
       Dropout       68.5      69.7                  16 





ANCOVAs performed on scores from the CDI Information scale 
produced no statistically significant gain estimates at midtest time 
(see Table 33). The posttest analyses (Table 34) are substantially 
more positive with 4 of 12 individual-site and all 3 across-site 
gain estimates attaining statistical significance. Standardized 
gain analyses increased most of the gain estimates, found 
statistical significance in one case where the corresponding ANCOVA 
did not (Site A, fourth cohort; gain = 3.0, F = 3.33, p < .05), and 
increased the significance level of two other estimates (Site A, 
combined, and Site B, fourth cohort). 






Table 33 

Treatment Group Raw Score Gains 
on the CDI Information Scale at Midtest Time: 
Estimates Derived from Covariance Analyses 

                           Pretest   Adj. Mid- 
Site   Cohort   Group      Mean      test Mean   Gain      N       F       p 

A      III      Treat.     12.2      13.1         1.9      26     1.69 
                Control    10.8      11.2                  13 

       IV       Treat.      9.9      13.3         --       29      -- 
                Control    11.0      13.3                  26 

       Comb.    Treat.     11.0      13.3          .7      55      .71 
                Control    10.9      12.5                  39 

B      III      Treat.     12.8      14.3          .1      82      .01 
                Control    12.5      14.2                  25 

       IV       Treat.     12.4      14.1          .1      40      .03 
                Control    13.2      14.0                  30 

       Comb.    Treat.     12.7      14.2          .2     122      .06 
                Control    12.9      14.1                  55 

C      III      Treat.     13.4      13.7         --       47      -- 
                Control    14.2      13.7                  29 

       IV       Treat.     12.8      12.9          .1      52      .00 
                Control    11.5      12.8                  10 

       Comb.    Treat.     13.1      13.3         --       99      -- 
                Control    13.5      13.3                  39 

D      III      Treat.     13.7      15.0          .5      47      .19 
                Control    13.6      14.6                  15 

       IV       Treat.     12.4      13.2        -1.0      73     4.21 
                Control    13.4      14.6                  50 

       Comb.    Treat.     12.9      13.9         -.7     120     1.71 
                Control    13.5      14.7                  65 

All    III      Treat.     13.1      14.2          .6     202     1.37 
                Control    13.0      13.6                  82 

       IV       Treat.     12.1      13.3         -.7     194     2.38 
                Control    12.6      14.0                 116 

       Comb.    Treat.     12.6      13.8         -.1     396      .14 
                Control    12.8      13.[?]               198 









Table 34 

Treatment Group Raw Score Gains 
on the CDI Information Scale at Posttest Time: 
Estimates Derived from Covariance Analyses 

                           Pretest   Adj. Post- 
Site   Cohort   Group      Mean      test Mean    Gain      N       F       p 

A      III      Treat.     12.4      14.0          2.1      18     3.04    .05 
                Control    11.6      11.9                   16 

       IV       Treat.      8.1      13.4          1.5      20      .98 
                Control    11.2      11.9                   17 

       Comb.    Treat.     10.2      13.8          1.9      38     3.90    .05 
                Control    11.4      11.9                   33 

B      III      Treat.     13.5      16.2          1.5      45     2.73 
                Control    11.5      14.7                   19 

       IV       Treat.     12.3      15.0          1.6      32     3.45    .05 
                Control    13.8      13.4                   25 

       Comb.    Treat.     13.0      15.7          1.8      77     8.23    .025 
                Control    12.8      13.9                   44 

C      III      Treat.     12.5      15.3          1.4      18      .94 
                Control    14.5      13.9                   13 

       IV       Treat.     13.8      12.5           .3      34      .03 
                Control     9.0      12.2                   11 

       Comb.    Treat.     13.5      13.4           .3      52      .12 
                Control    12.0      13.1                   24 

D      III      Treat.     14.8      15.4           .6      30      .14 
                Control    12.9      14.8                   14 

       IV       Treat.     12.6      13.4           .8      62     1.02 
                Control    13.4      12.6                   45 

       Comb.    Treat.     13.3      14.0           .9      92     1.60 
                Control    13.3      13.1                   59 

All    III      Treat.     13.6      15.4          1.4     111     4.84    .025 
                Control    12.5      14.0                   62 

       IV       Treat.     12.2      13.6          1.1     148     4.64    .025 
                Control    12.6      12.5                   98 

       Comb.    Treat.     12.8      14.4          1.3     259    10.27    .005 
                Control    12.6      13.1                  160 



It appears that Sites A and B outperformed Sites C and D, but 
no convincing explanation for this finding occurs to the authors. 
The fourth-cohort gain estimate is smaller than that for the third 
cohort, thus continuing the pattern observed with the other two CDI 
scales. The difference here, however, is small and statistically 
non-significant. 

The standardized gain analyses presented in Tables 35 and 36 
closely parallel the corresponding ANCOVAs. None of the resulting 
gain estimates is statistically significant at midtest time, but 
approximately half are significant at posttest time. Across sites, 
the gains at posttest time are also close in size to the estimates 
derived from the covariance analyses. 



Table 35 

Treatment Group Raw Score Gains 
on the CDI Information Scale at Midtest Time: 
Estimates Derived from Standardized Gain Analyses, Third Cohort 

                     Pretest   Adj. Mid- 
Site   Group         Mean      test Mean   Gain     N       F       p 

A      Treatment     12.2      14.6          .5     26      .25 
       Reg. HS       13.9      14.0                 56 

       Treatment     12.2      13.8          .9     26      .88 
       Alt. HS       12.9      13.0                 49 

       Treatment     12.2      14.1          .6     26      .17 
       Dropout       13.7      13.5                 18 

B      Treatment     12.8      14.8          .9     82     1.85 
       Reg. HS       14.0      13.9                 58 

       Treatment     12.8      15.0          .2     82      .07 
       Alt. HS       14.4      14.8                 52 

C      Treatment     13.4      14.8         -.3     47      .11 
       Reg. HS       15.8      15.1                 55 

       Treatment     13.4      14.6        -1.4     47     3.50 
       Alt. HS       15.8      16.0                 39 

All    Treatment     12.9      14.8          .5    155     1.05 
       Reg. HS       14.6      14.3                164 

       Treatment     12.9      14.6          .0    155      .00 
       Alt. HS       14.3      14.6                140 

       Treatment     12.2      14.1          .6     26      .17 
       Dropout       13.7      13.5                 18 









Table 36 

Treatment Group Raw Score Gains 
on the CDI Information Scale at Posttest Time: 
Estimates Derived from Standardized Gain Analyses, Third Cohort 

                     Pretest       Adj. Post- 
Site   Group         Mean          test Mean     Gain          N             F             p 

A      Treatment     12.4          15.8           -.6          18            .39 
       Reg. HS       14.9          16.4                        39 

       Treatment     12.4          14.9           2.5          18           4.04          .05 
       Alt. HS       13.6          12.4                        28 

       Treatment     12.4          14.7           1.4          18           1.36 
       Dropout       13.2          13.3                        16 

B      Treatment     [illegible]   [illegible]   [illegible]   [illegible]  [illegible] 
       Reg. HS       14.1          15.9                        42 

       Treatment     13.5          16.1            .2          45            .05 
       Alt. HS       12.3          15.9                        25 

C      Treatment     12.9          16.4           2.3          18           4.71          .025 
       Reg. HS       15.7          14.1                        51 

       Treatment     12.9          16.0            .5          18            .14 
       Alt. HS       16.1          15.5                         8 

All    Treatment     13.2          16.6           1.3          81           5.62          .01 
       Reg. HS       15.0          15.3                       132 

       Treatment     13.2          15.7           1.2          81           3.11          .05 
       Alt. HS       13.4          14.[?]                      61 

       Treatment     12.4          14.7           1.4          18           1.36 
       Dropout       13.2          13.3                        16 






Self-Esteem Inventory 



Tables 37 and 38 present the raw score gains made by second- 
cohort CIP students on the Coopersmith Self-Esteem Inventory between 
pre- and midtesting and between pre- and posttesting respectively. 
At midtest time, two of the individual-site as well as the 
across-site gain estimates on the Self-Esteem scale are statistically 
significant. At posttest time, however, there are no significant 
self-esteem gains. 



Table 37 

Treatment Group Pre-to-Midtest Raw Score Gains: 
Self-Esteem Inventory, Second Cohort 

                 Pretest   Midtest 
                 Mean      Mean      Gain     N      t             p 

Site A 
  Self-Esteem    35.1      39.0       3.9     21    1.64 
  Openness        1.7       2.7       1.0     21    1.92          .05 

Site B 
  Self-Esteem    33.7      37.9       4.1     38    3.42          .005 
  Openness        2.6       2.9        .2     38     .71 

Site C 
  Self-Esteem    33.1      34.9       1.8     28    1.24 
  Openness        2.2       2.1       -.1     28     .11 

Site D 
  Self-Esteem    38.3      41.3       3.0     15    2.60          .025 
  Openness        2.9       3.7        .8     15    1.29 

All Sites 
  Self-Esteem    34.5      37.8       3.3    102    4.17          .001 
  Openness        2.4       2.8        .4    102    [illegible]   .025 






Table 38 

Treatment Group Pre-to-Posttest Raw Score Gains: 
Self-Esteem Inventory, Second Cohort 

                 Pretest   Posttest 
                 Mean      Mean       Gain     N      t       p 

Site A 
  Self-Esteem    34.2      38.7        4.6     18    1.54 
  Openness        1.9       2.5         .6     18    1.54 

Site B 
  Self-Esteem    31.1      34.8        3.7     14    1.04 
  Openness        2.1       3.2        1.1     14    2.11    .05 

Site C 
  Self-Esteem    33.1      31.8       -1.3      9     .48 
  Openness        2.8       3.7         .9      9    1.45 

Site D 
  Self-Esteem    42.0      41.5        -.5      4     .20 
  Openness        3.7       4.2         .5      4     .48 

All Sites 
  Self-Esteem    33.7      36.4        2.7     45    1.55 
  Openness        2.3       3.1         .8     45    3.04    .005 


The low midtest gain at Site C is clearly consistent with 
events at that site. Both implementation and climate were at their 
lowest point at the time the second cohort was midtested. The 
larger and statistically significant gains at Sites B and D also 
make sense in terms of what was happening there. The Site A gain, 
because of its numerical value, seems inconsistent with the status 
of implementation there. It must be noted, however, that the gain 
estimate is not significantly different from zero, a fact that 
restores consistency between the gain and the site events. 

At posttest time it is somewhat surprising that the gain at
Site D was not positive and significant. With a sample size of only
four, however, such an expectation is unreasonable, and the small
negative gain shown by those four individuals cannot be taken as any
indication of program impact on self-esteem. The small sample size
at Site B may also be responsible for the lack of a statistically
significant gain.
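
To see why a sample of four cannot be expected to yield a significant gain, it may help to compute the tail probability of a t statistic directly. The following is a minimal sketch, assuming one-tailed tests (the tabled probabilities appear consistent with this, although the text does not state it for these tables) and df = N - 1 for a pre/post gain. The function names are illustrative, and the integration is a plain Simpson's rule rather than a library routine.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tailed_p(t, df, steps=2000):
    """Upper-tail probability P(T >= t), by Simpson's rule on [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area


# The same t value is judged very differently at N = 4 and N = 102.
p_small_sample = one_tailed_p(2.0, df=3)    # about .07, not significant at .05
p_large_sample = one_tailed_p(2.0, df=101)  # just under .025

# Across-site midtest Self-Esteem gain from Table 37 (t = 4.17, N = 102).
p_all_sites = one_tailed_p(4.17, df=101)
```

With only three degrees of freedom, a gain would have to be large relative to its standard error before it reached the .05 level, which is the point made about Site D.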

One individual-site and the across-site Openness gains were
statistically significant, both at midtest and at posttest time.
This finding, however, appears unrelated to any of the CIP
objectives. It may represent no more than the result of repeated
exposure to the instrument.




Tables 39 and 40, which include self-esteem gain estimates and 
related statistics for third- and fourth-cohort CIP participants 
derived from covariance analyses, present an almost totally negative 
picture. Although the across-site estimate for third-cohort stu- 
dents at posttest time is significant at the .05 level, only 1 of 
the other 29 gain estimates was found to be reliably greater than 
zero. 

While these results are not very different from the raw score
gains made by second-cohort CIP participants, it seemed that they
might be somewhat deflated by a kind of John Henry effect. Since
all control group students had been denied access to the program but
were mid- and posttested at the CIP facility, it seemed not unlikely
that they might distort self-reports in a positive way to cover up
the deprivation they felt. With this possibility in mind, a decision
was made to examine the raw score gains made by members of the
treatment and control groups.

Across sites, the third-cohort treatment group gained 3.5
points, a gain that would almost certainly have been significant
with 111 degrees of freedom. The control group, on the other hand,
gained 2.7 points. It is not clear whether that control group gain
can be attributed to a John Henry effect or whether it stemmed from
other causes. Some support for the former hypothesis, however, is
afforded by the fact that the regular and alternative high school
comparison groups, which comprised students who had not been denied
access to the program and who were not tested at the CIP facility,
made smaller self-esteem gains than the third-cohort control group
(1.4 raw score points in both instances).

In any case, the control group gain enters into the covariance
calculations and reduces both the size and the significance level of
the ANCOVA gain estimate. At Site A, the situation is even worse.
Although the treatment group gained 4.1 points, the control group
gained 5.4. A similar, although less dramatic, pattern is seen in
the fourth-cohort data. There the treatment group gained 2.3 points
while the control group gained 1.4. At Site B the treatment group
made a gain of 4.1 points, but it was largely offset by the 3.4
points gained by the control group.
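
The offsetting effect described here can be illustrated with simple arithmetic. The sketch below is not the covariance analysis reported in Tables 39 and 40; it only subtracts each control group's raw gain from the corresponding treatment group's raw gain, using the figures quoted in the text. The function name `net_gain` and the dictionary labels are illustrative assumptions.

```python
# Raw Self-Esteem gains quoted in the text: (treatment gain, control gain).
raw_gains = {
    "third cohort, across sites": (3.5, 2.7),
    "Site A": (4.1, 5.4),
    "fourth cohort, across sites": (2.3, 1.4),
    "Site B": (4.1, 3.4),
}


def net_gain(treatment_gain, control_gain):
    """Treatment gain left over once the control group's gain is removed."""
    return round(treatment_gain - control_gain, 1)


net = {label: net_gain(t, c) for label, (t, c) in raw_gains.items()}

for label, value in net.items():
    print(f"{label}: net raw gain {value:+.1f}")
```

A raw difference of this kind is only a rough proxy for the ANCOVA estimate, which also adjusts for pretest differences between the groups, but it shows how a sizable control group gain can erase most of an apparently healthy treatment group gain.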

One interesting finding that shows up in these analyses is that
the fourth cohort made smaller gains than the third cohort. If one
assumes that improved self-esteem is at least partially a counseling
outcome, then this finding is consistent with the reduced amount of
counseling available to fourth-cohort students, a situation that
apparently influenced other scores as well.

Tables 41 and 42 summarize the results of the standardized gain
analyses involving the three comparison groups. None of the gain
estimates is significant at midtest time (Table 41), but two
individual-site and two across-site estimates are significant at the
.05 level at posttest time (Table 42). It is also noteworthy that
the two significant individual-site gain estimates occur at Site B,



Table 39
Treatment Group Raw Score Gains
on the Self-Esteem Scale at Midtest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Mid-
Site  Cohort  Group     Mean      test Mean  Gain      N         F

A     III     Treat.    34.0      33.5       -.8       28        .08
              Control   32.4      34.3                 11
      IV      Treat.    35.8      39.2       1.1       29        .50
              Control   35.2      38.0                 26
      Comb.   Treat.    34.9      36.5       -.4       57        .06
              Control   34.4      36.8                 37

B     III     Treat.    36.5      38.3       1.5       81        1.34
              Control   36.0      36.8                 23
      IV      Treat.    34.7      37.4       .1        40        .01
              Control   35.3      37.2                 30
      Comb.   Treat.    35.9      38.0       .8        121       .91
              Control   35.6      37.1                 53

C     III     Treat.    35.0      36.6       .6        47        .16
              Control   36.2      36.0                 28
      IV      Treat.    34.2      36.6       1.3       52        .41
              Control   35.9      35.3                 10
      Comb.   Treat.    34.6      36.6       .9        99        .65
              Control   36.1      35.7                 38

D     III     Treat.    35.3      39.1       1.9       48        1.82
              Control   37.5      37.2                 14
      IV      Treat.    36.0      38.1       [illeg.]  75        .14
              Control   37.7      37.8                 49
      Comb.   Treat.    35.7      [illeg.]   1.0       123       1.33
              Control   37.7      [illeg.]             63

All   III     Treat.    35.5      37.4       1.1       204       1.87
              Control   35.8      36.3                 76
      IV      Treat.    35.2      37.7       .2        196       .13
              Control   36.4      37.5                 115
      Comb.   Treat.    35.4      37.6       .6        400       1.18
              Control   36.2      37.0                 191



Table 40
Treatment Group Raw Score Gains
on the Self-Esteem Scale at Posttest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Post-
Site  Cohort  Group     Mean      test Mean  Gain      N         F       p

A     III     Treat.    34.5      37.8       1.4       19        .39
              Control   [illeg.]  36.4                 16
      IV      Treat.    36.1      40.2       2.7       20        1.38
              Control   35.7      37.5                 17
      Comb.   Treat.    35.3      39.1       2.1       39        1.83
              Control   [illeg.]  37.0                 33

B     III     Treat.    36.9      40.2       1.9       45        2.63
              Control   33.9      37.3                 15
      IV      Treat.    36.2      39.5       1.9       31        2.39
              Control   33.2      37.6                 25
      Comb.   Treat.    36.6      39.8       2.3       76        4.87    .025
              Control   33.5      37.5                 40

C     III     Treat.    34.8      36.6       .3        18        .01
              Control   36.5      36.3                 13
      IV      Treat.    34.6      36.8       1.5       34        .64
              Control   35.7      35.3                 11
      Comb.   Treat.    34.7      36.8       1.0       52        .46
              Control   36.1      35.8                 24

D     III     Treat.    35.3      39.9       1.2       30        .30
              Control   37.8      38.7                 14
      IV      Treat.    35.8      37.5       -.3       62        .05
              Control   38.2      37.8                 44
      Comb.   Treat.    35.7      38.2       .1        92        [illeg.]
              Control   38.1      38.1                 58

All   III     Treat.    35.7      39.0       1.7       112       [illeg.]  .05
              Control   34.3      37.3                 58
      IV      Treat.    35.7      38.1       [illeg.]  147       .59
              Control   36.2      [illeg.]             [illeg.]
      Comb.   Treat.    35.7      38.5       1.1       259       2.53
              Control   35.5      37.4                 155



Table 41
Treatment Group Raw Score Gains
on the Self-Esteem Scale at Midtest Time:
Estimates Derived from Standardized Gain Analyses, Third Cohort

                  Pretest   Adj. Mid-
Site  Group       Mean      test Mean  Gain      N         F       p

A     Treatment   34.0      34.6       -.1       28        .00
      Reg. HS     35.2      34.7                 55

      Treatment   34.0      33.2       1.3       28        .54
      Alt. HS     32.9      34.5                 43

      Treatment   34.0      34.6       -.6       28        .08
      Dropout     36.4      35.2                 19

B     Treatment   36.5      37.4       1.6       81        2.19
      Reg. HS     34.0      38.8                 52

      Treatment   36.5      39.0       1.8       81        2.62
      Alt. HS     38.1      37.1                 55

C     Treatment   35.0      37.5       1.0       47        .61
      Reg. HS     36.8      [illeg.]             54

      Treatment   35.0      36.8       1.2       47        .89
      Alt. HS     35.8      35.6                 39

All   Treatment   35.6      36.8       .9        156       1.57
      Reg. HS     35.4      35.8                 161

      Treatment   35.6      37.0       .9        156       1.25
      Alt. HS     35.8      36.1                 137

      Treatment   34.0      34.6       -.6       28        .08
      Dropout     36.4      35.2                 19



the only site where implementation was nearly ideal throughout the
entire year between pre- and posttesting of third-cohort students.

The standardized gain analyses produced substantially more
positive results than the ANCOVAs. As suggested earlier, this
difference tends to support the hypothesis that control group
students may have biased their reports because they had been denied
entry into the program. It seems likely, in view of this
possibility, that the standardized gain analyses provide more valid
estimates of program impact on self-esteem than the covariance
analyses.



Table 42
Treatment Group Raw Score Gains
on the Self-Esteem Scale at Posttest Time:
Estimates Derived from Standardized Gain Analyses, Third Cohort

                  Pretest   Adj. Post-
Site  Group       Mean      test Mean  Gain      N         F       p

A     Treatment   34.5      38.0       1.0       19        .23
      Reg. HS     [illeg.]  37.0                 39

      Treatment   34.5      [illeg.]   2.6       19        1.41
      Alt. HS     31.6      34.5                 28

      Treatment   34.5      38.6       1.4       19        .42
      Dropout     34.4      37.2                 16

B     Treatment   36.9      39.1       2.5       45        3.18    .05
      Reg. HS     34.0      36.6                 41

      Treatment   36.9      40.0       3.4       45        3.41    .05
      Alt. HS     35.7      36.6                 26

C     Treatment   34.8      38.2       1.8       18        .75
      Reg. HS     36.9      36.3                 50

      Treatment   34.8      36.5       -1.3      18        .18
      Alt. HS     35.2      37.8                 8

All   Treatment   35.9      38.5       1.8       82        3.22    .05
      Reg. HS     34.9      36.7                 130

      Treatment   35.9      38.5       2.0       82        3.19    .05
      Alt. HS     34.3      36.5                 62

      Treatment   34.5      38.6       1.4       19        .42
      Dropout     34.4      37.2                 16





Tables 43 through 46 summarize the results of the covariance
and standardized gain analyses performed on Coopersmith Openness
scores. None of the across-site analyses shows a significant gain
estimate and only 5 of the 50 individual-site estimates are
statistically significant. (Two-tailed tests were used in these
analyses as there was no reason to predict that the program
treatment would either raise or lower scores on this scale.)
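
The distinction between one- and two-tailed probabilities can be made concrete. For a two-group comparison the F statistic has one numerator degree of freedom, so its square root behaves like a t statistic; the usual F-test probability is two-tailed, and a directional (one-tailed) probability is half of it when the gain lies in the predicted direction. The sketch below substitutes a large-sample normal approximation for the exact F distribution. That substitution is an assumption made here for brevity and gives only a rough check at the sample sizes in these tables; the function names are illustrative.

```python
import math


def two_tailed_p_from_f(f_value):
    """Approximate two-tailed p for an F with 1 numerator df.

    Uses sqrt(F) as a z score (normal approximation to the t distribution).
    """
    z = math.sqrt(f_value)
    return math.erfc(z / math.sqrt(2))


def one_tailed_p_from_f(f_value):
    """Directional p: half the two-tailed value (gain in predicted direction)."""
    return two_tailed_p_from_f(f_value) / 2


# F = 4.84 is the value tabled for Site B, third cohort, in Table 43.
p_two = two_tailed_p_from_f(4.84)
p_one = one_tailed_p_from_f(4.84)
print(f"two-tailed p ~ {p_two:.3f}, one-tailed p ~ {p_one:.3f}")
```

Under this approximation the Site B value falls below .05 on a two-tailed test, in line with the starred entry in Table 43.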

The nonsignificance and apparent irrelevance of these gain
estimates to program goals suggest that no further attempts at
interpretation be made.






Table 43
Treatment Group Raw Score Gains
on the Openness Scale at Midtest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Mid-
Site  Cohort  Group     Mean      test Mean  Gain      N         F       p

A     III     Treat.    2.6       3.8        .9        28        2.35
              Control   3.2       2.9                  12
      IV      Treat.    2.9       3.2        .7        29        2.52
              Control   2.2       2.5                  26
      Comb.   Treat.    2.7       3.5        .8        57        5.29    .025*
              Control   2.5       2.7                  38

B     III     Treat.    2.8       2.4        -.8       81        4.84    .05*
              Control   3.4       3.3                  22
      IV      Treat.    2.9       2.6        .1        40        .01
              Control   2.6       2.6                  30
      Comb.   Treat.    2.8       2.5        -.4       121       2.07
              Control   2.9       2.9                  52

C     III     Treat.    2.1       2.9        -.2       47        .41
              Control   2.8       3.1                  28
      IV      Treat.    2.7       3.2        .4        52        .48
              Control   3.4       2.8                  10
      Comb.   Treat.    2.4       3.0        -.1       99        .04
              Control   3.0       3.1                  38

D     III     Treat.    2.6       2.8        .5        48        1.02
              Control   2.1       2.3                  14
      IV      Treat.    2.9       2.4        -.5       75        2.83
              Control   2.8       2.9                  49
      Comb.   Treat.    2.8       2.6        -.2       123       .59
              Control   2.6       2.8                  63

All   III     Treat.    2.6       2.8        -.2       204       1.08
              Control   2.9       3.0                  76
      IV      Treat.    2.8       2.8        .0        196       .01
              Control   2.6       2.7                  115
      Comb.   Treat.    2.7       2.8        -.1       400       .19
              Control   2.7       2.8                  191

*Two-tailed probability













Table 44 
Treatment Group Raw Score Gains 
on the Openness Scale at Posttest Time: 
Estimates Derived from Covariance Analyses 







Pretest 


Adj. Post- 










Site Cohort 


Group 


Mean 


test Mean 


Gain 


N 


F 


P 


A III 


Treat . 


3.2 


3.0 


1.3 


19 


^.63 


.05 




Control 


2.2 


1.7 




15 




IV 


Treat . 


3 2 


2.9 


- .3 


20 


.16 


— 




Control 




3.2 




17 


" 




Comb , 


Treat. 
Control 


3.2 
2.0 


3.0 . 


.6 


39 
32 


1.29 


— 


B III 


Treat. 
Control 


2.7 


2.8 
2.7 


.1 


45 
15 


.05 


— - 


IV 


Treat . 


. 2.9 


3.3 


.9 


31 


4.51 


.025 





. Control.. 


2.6 


2.4 




25 


- 




Comb* 


Treat. 
Control 


2.7 
2.6 


3.0 
2.5 


•5 


76 
40 


2.72 




















" C III 


Treat. 
Control 


1.8 
2.5 


2.5 
2.8 


- .3 


18 
13 


.20 




IV 


Treat. 


2.9 


3.2 ■ 


- .6 


34 


1.06 







Control 


3.6 


r 3.8 




11 


t 




Comb • 


Treat. 
Control 


2.5 
3.0 


3.0 
3.2 


- .2 


52 
24 


.49 





D III 


Treat. 


2.9 


3.2 


.6 


30 


•91 






Control 


2.1 


2.6 

9 




12 






IV 


Treat. 


2.9 


• 2.9 


« .2 


62 


.38 






Control 


2.9 


3.1 




44 






Comb. 


Treat. 


2.9 


3.0 




92 


— 







Control 


2.8 


3. Ox 




56 






■ T ' 

All III 


Treat. 
Control 


2.6 ' 
2.4 


2.9 
2.4 


.5 


112 
55 


2.57 




IV 


Treat. 
Control 


2.9 
2.7 


3.1 
3.0 


.1 


147 
97 


.04 




Comb * 


Treat. 
Control 


2.8 
2.6 


3.0 
2.8 


.2 


259 
152 


1.18 








Table 45
Treatment Group Raw Score Gains
on the Openness Scale at Midtest Time:
Estimates Derived from Standardized Gain Analyses, Third Cohort

                  Pretest   Adj. Mid-
Site  Group       Mean      test Mean  Gain      N         F       p

A     Treatment   2.6       3.6        .9        28        5.65    .05
      Reg. HS     2.3       2.6                  55

      Treatment   2.6       3.8        -.1       28        .01
      Alt. HS     2.6       3.8                  43

      Treatment   2.6       3.8        .0        28        .00
      Dropout     2.6       3.8                  19

B     Treatment   2.8       2.2        -.6       81        3.49
      Reg. HS     2.4       2.8                  52

      Treatment   2.8       2.3        [illeg.]  81        .01
      Alt. HS     2.6       2.4                  55

C     Treatment   2.1       2.9        .3        47        .73
      Reg. HS     2.4       2.6                  54

      Treatment   2.1       2.6        .6        47        3.52
      Alt. HS     1.9       2.0                  39

All   Treatment   2.5       2.6        -.1       156       .09
      Reg. HS     2.4       2.7                  161

      Treatment   2.5       2.6        -.1       156       .26
      Alt. HS     2.4       2.8                  137

      Treatment   2.6       3.8        .0        28        .00
      Dropout     2.6       3.8                  19



Table 46 
Treatment Group Raw Score Gains 
on the Openness Scale at Posttest Time: 
Estimates Derived from Standardized Gain Analyses, Third Cohort 



Pretest ' Adj. Post- 



Site 


Group 


Mean 


test Mean 


Gain ' 


N 


F 


P 


A 


Treatment 


3.2 


2.8 


.4 


19 


.47 






Reg. HS 


2.5 


2.4 




39 








Treatment 


3.2 


2.4 


- .3 


19 


• .36 






Alt. HS 


2.2 


2.7 




18 








Treatment 


3.2 


2.9 


.8. 


19 


1.73 






Dropout 


2.6 


2.1 




16 







Treatment 
Reg. HS 

Treatment 
^ Alt. HS 



2.6 
2.5 

2.6 
2.3 



2.8 
2.5 

2.8 
2.5 



.3 



45 
41 

45 
26 



-Treatment... 
Reg. Ha 



1.5 



-2^.-2-- 
2.6 



18 
50 



.06 



.42 



.25 



Treatment 1.8 ' 2. .8 .3 18 .81 — 

Alt. HS 2.3 2.5 8 



All , 


* 1 

Treatment 
Reg. HS 


2.6 
2.3 


2.6 
2.6 




82 
130 




Treatment 
Alt. HS 


2.6 
2.4 


2.7 
2.4 


.3 


82 .81 — 
62 




Treatment 
Dropout 


3.2 

2.6 ' 


2.9 
2.1 


.8 


19 1.73 — 
16 






Internal-External Scale

The results of pre-to-midtest and pre-to-posttest raw score
gain analyses for second-cohort CIP participants are summarized,
respectively, in Tables 47 and 48. None of the pre-to-midtest gains
and only one of the pre-to-posttest gains is statistically
significant. The sample sizes for the individual-site, pre-to-posttest
analyses are all quite small and account, in large measure, for the
negative results. The larger sample size for the across-site gain
was responsible for the significant t.



Table 47

Treatment Group Pre-to-Midtest Raw Score Gains:
Internal-External Scale, Second Cohort

             Pretest   Midtest
             Mean      Mean         Gain    N      t      p

Site A       15.8      17.0         1.3     22     1.30
Site B       15.8      15.8         .0      40     .04
Site C       15.4      [illegible]  -1.3    26     1.72
Site D       15.0      14.9         -.1     15     .19
All Sites    15.6      15.5         -.1     103    .19



Table 48

Treatment Group Pre-to-Posttest Raw Score Gains:
Internal-External Scale, Second Cohort

             Pretest   Posttest
             Mean      Mean         Gain    N      t      p

Site A       15.9      [illegible]  1.9     18     1.69
Site B       15.0      16.5         1.5     15     1.02
Site C       13.2      14.1         .9      9      .57
Site D       17.0      18.6         1.6     5      .83
All Sites    15.2      16.8         1.6     47     2.18   .025



Tables 49 and 50 summarize the results of the ANCOVAs. Only
one of the individual-site gain estimates is statistically
significant and none of the across-site analyses shows a significant F.
It was hypothesized that the same forces might be operating here as
appeared to operate in the case of the Self-Esteem scale; in other
words, that members of the control group might deliberately distort
their responses in order to appear in a more favorable light. The
data, however, did not offer strong support for this hypothesis.



In terms of raw scores, the third-cohort treatment group shows
a pre-to-posttest gain of 1.4, which is statistically significant
(t = 2.87, df = 112, p < .01). The control group has a gain of .6
raw score points, which is nonsignificant but large enough to
prevent the ANCOVA from showing a significant gain. Had data from the
fourth cohort presented a similar picture, a plausible case could
have been made for biased self-reporting. In fact, however, the
fourth-cohort control group's mean posttest score is lower than its
pretest score (although not significantly). While the gain made by
the treatment group is only .2 raw score points, the control group's
performance served to inflate the ANCOVA estimate, yielding a value
of .4 points. This finding appeared to negate the John Henry
hypothesis.



Tables 51 and 52 summarize the results of the standardized gain
analyses. Although one individual-site and one across-site gain
estimate are significant at posttest time, the picture suggests that
the CIP does not strongly or consistently affect locus of control.
If there is any effect, it is slow to develop. None of the gains
from any of the analyses is significant at midtest time. Neither
are any of the fourth-cohort gains significant after nine months
(the pre-to-posttest interval for that cohort).






Table 49
Treatment Group Raw Score Gains
on the Internal-External Scale at Midtest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Mid-
Site  Cohort  Group     Mean      test Mean  Gain      N         F       p

A     III     Treat.    16.6      15.4       -2.4      28        9.25
              Control   14.8      17.7                 12
      IV      Treat.    15.2      16.4       .9        28        1.25
              Control   15.5      15.5                 26
      Comb.   Treat.    15.9      15.9       -.3       56        .33
              Control   15.2      16.2                 38

B     III     Treat.    15.8      16.1       .2        80        .07
              Control   16.2      15.9                 24
      IV      Treat.    15.1      15.6       .1        40        .02
              Control   15.5      [illeg.]             28
      Comb.   Treat.    15.6      15.9       .2        120       .17
              Control   15.8      15.7                 52

C     III     Treat.    15.5      16.4       -.4       46        .38
              Control   15.6      16.8                 28
      IV      Treat.    14.8      14.5       -.6       52        .27
              Control   17.5      15.1                 10
      Comb.   Treat.    15.2      15.4       -.9       98        2.30
              Control   16.1      16.3                 38

D     III     Treat.    15.8      16.8       .3        46        .11
              Control   16.1      16.5                 15
      IV      Treat.    15.2      15.7       -.6       65        .69
              Control   16.1      16.4                 45
      Comb.   Treat.    15.4      16.2       -.2       111       .21
              Control   16.1      16.4                 60

All   III     Treat.    15.9      16.2       -.5       200       1.48
              Control   15.8      16.7                 79
      IV      Treat.    15.1      15.5       -.3       185       .31
              Control   15.9      15.8                 109
      Comb.   Treat.    15.5      15.8       -.3       385       1.42
              Control   15.8      16.2                 188



Table 50
Treatment Group Raw Score Gains
on the Internal-External Scale at Posttest Time:
Estimates Derived from Covariance Analyses

                        Pretest   Adj. Post-
Site  Cohort  Group     Mean      test Mean  Gain      N         F       p

A     III     Treat.    17.0      16.8       .9        19        .79
              Control   14.4      15.9                 16
      IV      Treat.    15.0      16.3       .1        20        .01
              Control   14.9      16.2                 17
      Comb.   Treat.    15.9      16.6       .6        39        .54
              Control   14.6      16.0                 33

B     III     Treat.    15.5      16.8       1.7       46        3.20    .05
              Control   15.5      15.1                 16
      IV      Treat.    15.1      15.3       -.6       31        .45
              Control   15.3      15.9                 25
      Comb.   Treat.    15.4      16.2       .6        77        1.07
              Control   15.4      15.6                 41

C     III     Treat.    15.9      17.2       -.5       18        .23
              Control   15.8      17.7                 13
      IV      Treat.    14.9      15.7       1.1       34        1.03
              Control   16.2      14.6                 11
      Comb.   Treat.    15.2      16.2       -.2       52        .05
              Control   16.0      16.4                 24

D     III     Treat.    15.6      16.4       .5        30        .16
              Control   15.4      15.9                 14
      IV      Treat.    15.1      14.9       .1        54        .02
              Control   16.6      14.8                 45
      Comb.   Treat.    15.3      15.3       .2        84        .10
              Control   16.3      15.1                 59

All   III     Treat.    15.8      16.8       .8        113       2.37
              Control   15.2      16.0                 59
      IV      Treat.    15.0      15.4       [illeg.]  139       [illeg.]
              Control   15.9      15.4                 98
      Comb.   Treat.    15.4      16.0       .4        252       .83
              Control   15.7      15.6                 157



Table 51
Treatment Group Raw Score Gains
on the Internal-External Scale at Midtest Time:
Estimates Derived from Standardized Gain Analyses, Third Cohort

                  Pretest   Adj. Mid-
Site  Group       Mean      test Mean  Gain      N         F       p

A     Treatment   16.6      15.4       -.3       28        1.41
      Reg. HS     16.5      16.2                 55

      Treatment   16.6      14.3       -1.9      28        5.22
      Alt. HS     14.6      16.2                 48

      Treatment   16.6      15.3       -.9       28        .46
      Dropout     16.3      16.2                 17

B     Treatment   15.8      16.1       .4        80        .34
      Reg. HS     15.9      15.6                 52

      Treatment   15.8      16.5       .1        80        .01
      Alt. HS     16.7      16.4                 51

C     Treatment   15.5      16.4       .0        46        [illeg.]
      Reg. HS     15.4      16.4                 53

      Treatment   15.5      16.7       .4        46        .14
      Alt. HS     16.0      16.2                 38

All   Treatment   15.9      16.1       .0        154       .01
      Reg. HS     16.0      16.0                 160

      Treatment   15.9      16.0       -.4       154       1.03
      Alt. HS     15.8      16.4                 137

      Treatment   16.6      15.3       -.9       28        .46
      Dropout     16.3      16.2                 17




Table 52
Treatment Group Raw Score Gains
on the Internal-External Scale at Posttest Time:
Estimates Derived from Standardized Gain Analyses, Third Cohort

                  Pretest   Adj. Post-
Site  Group       Mean      test Mean  Gain      N         F       p

A     Treatment   17.0      17.2       1.2       19        1.92
      Reg. HS     16.6      16.0                 39

      Treatment   17.0      15.7       -1.4      19        2.05
      Alt. HS     13.0      17.1                 27

      Treatment   17.0      17.8       1.7       19        2.45
      Dropout     17.8      16.1                 14

B     Treatment   15.5      17.2       1.2       46        2.16
      Reg. HS     16.4      16.1                 41

      Treatment   15.5      17.2       2.5       46        5.28    .025
      Alt. HS     16.5      14.7                 23

C     Treatment   15.9      16.9       1.8       18        2.47
      Reg. HS     15.5      15.0                 50

      Treatment   15.9      17.5       [illeg.]  18        [illeg.]
      Alt. HS     16.9      17.9                 8

All   Treatment   15.9      17.2       1.6       83        8.63    .005
      Reg. HS     16.1      15.6                 130

      Treatment   15.9      16.7       .1        83        .01
      Alt. HS     14.9      16.6                 58

      Treatment   17.0      17.8       1.7       19        2.45
      Dropout     17.8      16.1                 14



Follow-Up Outcomes



All of the analyses performed on the follow-up data involve
contrasts between the treated and the untreated portions of the
treatment group. In those situations where control groups were
available, their data were contrasted with those of the total
treatment group as well as with those of the treated subgroup.
Comparisons between control and treatment groups are less subject to
bias resulting from (possibly) differential attrition and are therefore
more credible. On the other hand, because members of the untreated
subgroup received little or no treatment, the size of the treatment
effect is necessarily diminished. Comparisons between the control
group and the treated subgroup can be expected to show larger
differences, but the possibility that these differences result from
self-selection rather than from the treatment is also more
plausible.

All of the follow-up data were analyzed using Chi Square
techniques. Most of them involved 2x2 tables where, for example,
the numbers of employed and unemployed youths from treatment and
control groups were tallied and compared.

Table 53 presents, by site, cohort, and group, the numbers of
students about whom it was possible to obtain some information. For
the treatment and control groups (but not for the treated and
untreated subgroups), these numbers are also expressed as percentages
of the corresponding total groups pretested. As can be seen, it was
possible to obtain a much higher percentage of follow-up returns
than either mid- or posttest scores. Overall, the first follow-up
return percentage was [illegible]3% for the combined treatment groups
and 76% for the combined control groups. The corresponding figures for
the second follow-up were 76% and 72%.

Site B had the highest return rate for both follow-ups, while
Site C had the lowest for the first follow-up but was tied with
Sites A and D for the second. These individual-site return rates
are thought to reflect both the difficulty in locating the students
(due to their mobility, for example) and the resourcefulness and
zeal of the site assistants. Unfortunately, it is not possible to
separate out the relative contributions of these influences.

It should be pointed out that not all of the follow-up data are
highly reliable. Where direct contact with the students in question
proved impossible, we attempted to gain information from friends,
relatives, school records, and other sources. Occasionally,
different sources would yield contradictory information about a single
individual. One CIP intern, for example, was reported as dropped
out and unemployed by a relative when, in fact, he had graduated
from the CIP and was enrolled as a full-time student in college. We
sorted out such conflicting stories as carefully as we could, but
some errors almost certainly remain in the data.




Table 53

Return Rates for the First and Second
Follow-Ups by Site, Cohort, and Group



Cdjiort 


•V 

Group 


Site 






R 
o 


ox Lc 


c 


, oxue 


u 


Ist 


2ni 


1st 


2nd 


ist 


2nd 


•1st 


2nd 


TT 










AO 






JLJL 


OA 

JL\i 




^ Untreated 


20 


18 


"12 < 




0 


3 


18 


17 




Total 


.49 , 


48 


64 


62 


5 25 


37 


40 


•37 












* 




(76%) 


(60%) 




T T T 
ill ^ 


ireacea 


J J 






o9 


JO 


59 


67 


77 




Untreated 


34 


29 


•3 


^ 12 


15 


27 


14 


21 ^ 




; Total 


87 


79 


105 


101 


' 45 


86 


81 


98 










V.O/*; 






(72%) 


(69%) 


(82%) 


• 


Control 


41 


41 


49 


44 ' 


'27 


38 


31 


38 






(75%) 


(75%) 


(to) 


(73%) 


(50%) 


(70%)»(56%) 


<69%) 


IV 


^'^ Treat e;d 


52 




51 




51 




95 






Untreated 


47 




16 




8 




22 






Total 


99 




- 67 




59 




117 








(98%) 




(89%) 


• 


(89%) 




(66%) 






- Control 


46(^ 




' ^ 
58 








5,5 








^ (84i) 




(78%) 








(90%) 





All • Treatment .235 _ 127 236 163 .129 123 238 135 

, • (8^;?);,^73%) (87%)- (83%)' (53%) (73%) (^66%) (73%) 

"AU Control/ 87 ' 41 -.407 ' ^4 49 38 126 38 

\' ■ • /• ; f^?> (75%)* (80%) (73%) (59%) (70%) (7%jf) (69%) 



Tables 54 and 55 present statistics relevant to the high
school status of second-cohort CIP interns. Across sites, at the
time of the first follow-up, two-thirds of the treated group had
graduated from high school, were currently enrolled, or had
received a GED, while two-thirds of the untreated group had dropped
out prior to graduation and had not received a GED. (There were no
control groups for the second cohort.) At the time of the second
follow-up, the results are only slightly less dramatic, with
two-thirds falling to 63% in the case of the treated group and to 61% in
the case of the untreated group. The overall results of both
follow-ups are highly significant (p < .0025 in both cases).
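
As an illustration of the kind of 2x2 Chi Square comparison behind these results, the sketch below recomputes the first follow-up contrast from Table 54. The cell counts are not given in the report; they are reconstructed here from the rounded percentages and sample sizes (67% of 128 treated and 34% of 50 untreated), so the result is approximate. The Pearson statistic is computed without a continuity correction, which is an assumption, since the report does not say which form was used. The function names are illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def counts_from_percent(percent_positive, sample_size):
    """Rebuild (positive, negative) counts from a rounded percentage."""
    positive = round(sample_size * percent_positive / 100)
    return positive, sample_size - positive


# Table 54, all sites: % graduated, holding a GED, or enrolled, and sample size.
treated = counts_from_percent(67, 128)    # (in school or graduated, dropped out)
untreated = counts_from_percent(34, 50)

chi2 = chi_square_2x2(*treated, *untreated)

# Critical value of chi-square with 1 degree of freedom at the .001 level.
CRITICAL_001_DF1 = 10.83
print(f"chi-square = {chi2:.2f}, exceeds .001 critical value: {chi2 > CRITICAL_001_DF1}")
```

On these reconstructed counts the statistic is well beyond the .001 critical value, which agrees with the description of the overall result as highly significant.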



Table 54

High School Status of Treated and Untreated
Group Members: First Follow-Up, Second Cohort

                    % Grad., GED,   % Dropped   Sample
Site  Group         or Enrolled     Out         Size

A     Treated       69%             31%         29
      Untreated     50%             50%         20

B     Treated       75%             25%         52
      Untreated     50%             50%         12

C     Treated       52%             48%         25
      Untreated                                 0

D     Treated       64%             36%         22
      Untreated     6%              94%         18

All   Treated       67%             33%         128
      Untreated     34%             66%         50




Table 55

High School Status of Treated and Untreated
Group Members: Second Follow-Up, Second Cohort

                    % Grad., GED,   % Dropped   Sample
Site  Group         or Enrolled     Out         Size

A     Treated       60%             40%         30
      Untreated     56%             44%         18

B     Treated       71%             29%         49
      Untreated     54%             46%         13

C     Treated       56%             44%         34
      Untreated     33%             67%         3

D     Treated       60%             40%         20
      Untreated     12%             88%         17

All   Treated       63%             37%         133
      Untreated     39%             61%         51

The individual-site findings are most dramatic at Site D, where
the data suggest that very few of those who did not enroll in the
CIP, or who dropped out shortly after enrollment, returned to school
or entered GED programs. A partial explanation for this fact is
that all of the second-cohort interns at Site D had previously
dropped out of school. Apparently, their disenchantment with "the
system" continued.

For the treated group, the results were most favorable at Site 
B, a finding that is consistent with the state of implementation at 
that site at that time. The individual-site Chi Squares (both first 
and second follow-ups) were only significant at Site D, however, and 
primarily because of the high dropout rate in the untreated group.

Tables 56 and 57 present data on the high school status of
third-cohort treated, untreated, and control group members. Results
from the first follow-up look much like the corresponding
second-cohort findings as far as the treated and untreated subgroups
are concerned. Across sites, approximately 60% of the treated subgroup
members have graduated from high school, are currently enrolled, or
have earned a GED. Only 40% of the untreated subgroup fall into
this category. The control group percentages are approximately half
way between those of the treated and untreated subgroups. The
treated and untreated subgroups are significantly different (Chi
Square = 10.00, p < .01) but neither the treatment group nor the
treated subgroup is significantly different from the control group.
At Site C, however, the treatment group is significantly superior to
the control group (Chi Square = 4.18, p < .05).
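
The across-site treated-versus-untreated statistic quoted above can be
checked with a short calculation. The sketch below is an illustration
only, not the authors' procedure. It assumes an uncorrected Pearson
chi-square on a 2 x 2 table, rebuilds the cell counts from the rounded
"All" percentages in Table 56 (60% of 241 treated, 39% of 76
untreated), and uses hypothetical function names. Under those
assumptions the result comes out close to the reported 10.00.

```python
import math


def counts_from_percent(pct_positive, n):
    """Rebuild (positive, negative) counts from a rounded percentage.

    Assumption: the published percentage was rounded from whole counts.
    """
    positive = round(pct_positive * n / 100)
    return positive, n - positive


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail probability for a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# "All" rows of Table 56: % graduated, GED, or enrolled, and sample size.
treated = counts_from_percent(60, 241)
untreated = counts_from_percent(39, 76)

chi2 = chi_square_2x2(*treated, *untreated)
p = p_value_1df(chi2)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Because the counts are reconstructed from rounded percentages, small
discrepancies from the published statistics are to be expected when the
same check is applied to other rows.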

At the time of the second follow-up, the treated and untreated
subgroups remain significantly different (Chi Square = 3.9, p <
.05), but the difference is somewhat smaller than at the time of the
first follow-up. The results at Site C continue to favor the
treatment over the control group (Chi Square = 4.14, p < .05).

At Site A, the control group has a larger percentage of students
who have graduated from high school, are currently enrolled,
or have obtained a GED than any of the other groups at any of the
other sites. This unexpected finding may reflect the fact that Site
A had some 20 other alternative programs readily available to
students who were having difficulty in high school. In any case, it
has an important effect on the overall results. When Site A data
are removed, the composite treated subgroup (from the other three
sites) has a significantly better high school performance record
than the control group (Chi Square = 4.23, p < .05).
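
The effect of removing Site A can be illustrated by pooling the
remaining sites before testing. The sketch below is a hedged
reconstruction rather than the authors' computation. It assumes the
comparison refers to the second follow-up (Table 57), rebuilds counts
from rounded percentages, takes the Site C control sample size (37)
from the column total, and uses an uncorrected Pearson chi-square. All
function and variable names are hypothetical.

```python
# (percent graduated/GED/enrolled, sample size) by site, as read from Table 57.
TREATED = {"A": (46, 50), "B": (54, 89), "C": (44, 59), "D": (49, 77)}
CONTROL = {"A": (63, 41), "B": (52, 44), "C": (22, 37), "D": (38, 37)}

CRITICAL_05_1DF = 3.84  # chi-square critical value, 1 df, alpha = .05


def pooled_counts(group, exclude=()):
    """Sum reconstructed (positive, negative) counts over the included sites."""
    positive = negative = 0
    for site, (pct, n) in group.items():
        if site in exclude:
            continue
        k = round(pct * n / 100)  # count implied by the rounded percentage
        positive += k
        negative += n - k
    return positive, negative


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


chi2_all = chi_square_2x2(*pooled_counts(TREATED), *pooled_counts(CONTROL))
chi2_no_a = chi_square_2x2(
    *pooled_counts(TREATED, exclude=("A",)),
    *pooled_counts(CONTROL, exclude=("A",)),
)

print(f"all four sites:  chi-square = {chi2_all:.2f}")
print(f"Site A excluded: chi-square = {chi2_no_a:.2f}")
```

Under these assumptions the four-site comparison falls well below the
.05 critical value, while the three-site comparison is close to the
4.23 reported in the text.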












                            Table 56

    High School Status of Treated, Untreated, and Control
        Group Members: First Follow-Up, Third Cohort

                      % Grad., GED,    % Dropped    Sample
Site    Group          or Enrolled        Out        Size

 A      Treated            57%            43%          53
        Untreated          38%            62%          34
        Control            61%            39%          41

 B      Treated            64%            36%          91
        Untreated          62%            38%          13
        Control            55%            45%          49

 C      Treated            57%            43%          30
        Untreated          47%            53%          15
        Control            28%            72%          25

 D      Treated            58%            42%          67
        Untreated          14%            86%          14
        Control            53%            47%          30

 All    Treated            60%            40%         241
        Untreated          39%            61%          76
        Control            52%            48%         145












                            Table 57

    High School Status of Treated, Untreated, and Control
        Group Members: Second Follow-Up, Third Cohort

                      % Grad., GED,    % Dropped    Sample
Site    Group          or Enrolled        Out        Size

 A      Treated            46%            54%          50
        Untreated          45%            55%          29
        Control            63%            37%          41

 B      Treated            54%            46%          89
        Untreated          42%            58%          12
        Control            52%            48%          44

 C      Treated            44%            56%          59
        Untreated          33%            67%          27
        Control            22%            78%          37

 D      Treated            49%            51%          77
        Untreated          29%            71%          21
        Control            38%            62%          37

 All    Treated            49%            51%         275
        Untreated          37%            63%          89
        Control            45%            55%         159























Table 58 presents high school status information for the fourth
cohort at the time of its first (and only) follow-up. Across sites,
both the treated versus control and the treatment versus control
comparisons are statistically significant (Chi Squares = 18.05 and
13.40 respectively, p < .001 in both cases). In both cases, these
findings are largely attributable to Site D, where 80% of the treated
subgroup have graduated from high school, are currently enrolled, or
have obtained a GED. This finding, of course, is consistent with
the full operational status and positive climate that had emerged at
Site D by the time the fourth cohort enrolled.

The control group at Site A continues to present an unexpectedly
positive picture with respect to high school status. While
it is not surprising that the treatment group shows up as it does
(given the state of program implementation at Site A), the control
group percentages for Site A are significantly more favorable than
those at the other three sites combined (Chi Square = 8.65, p <

                            Table 58

    High School Status of Treated, Untreated, and Control
        Group Members: First Follow-Up, Fourth Cohort

                      % Grad., GED,    % Dropped    Sample
Site    Group          or Enrolled        Out        Size

 A      Treated            62%            38%          52
        Untreated          45%            55%          47
        Control            63%            37%          46

 B      Treated            47%            53%          51
        Untreated          62%            38%          16
        Control            52%            48%          58

 C      Treated            51%            49%          51
        Untreated           0%           100%           8
        Control            32%            68%          22

 D      Treated            80%            20%          95
        Untreated          68%            32%          22
        Control            33%            67%          95

 All    Treated            63%            37%         249
        Untreated          49%            51%          93
        Control            44%            56%         221



Tables 59 and 60 summarize second-cohort data from the first
and second follow-ups. The comparisons are between those who are
either enrolled in some type of school program (high school,
college, GED, or vocational) or employed (full- or part-time) and those
who are neither in school nor employed. At the time of the first
follow-up, there are significantly more members of the across-site
treated subgroup than of the untreated subgroup who are either in
school or employed (Chi Square = 6.66, p < .01). Six months later,
however, the relationship is no longer significant. In almost every
instance, the status of the untreated subgroup is shown to improve
while the status of the treated subgroup is shown to deteriorate.



                            Table 59

      School/Employment Status of Treated and Untreated
        Group Members: First Follow-Up, Second Cohort

                       % in School    % Not in School    Sample
Site    Group          or Employed    and Unemployed      Size

 A      Treated            62%             38%             29
        Untreated          40%             60%             20

 B      Treated            60%             40%             52
        Untreated          50%             50%             12

 C      Treated            56%             44%             25
        Untreated          --              --               0

 D      Treated            82%             18%             22
        Untreated          39%             61%             18

 All    Treated            63%             37%            128
        Untreated          42%             58%             50




                            Table 60

      School/Employment Status of Treated and Untreated
        Group Members: Second Follow-Up, Second Cohort

                       % in School    % Not in School    Sample
Site    Group          or Employed    and Unemployed      Size

 A      Treated            57%             43%             30
        Untreated          44%             56%             18

 B      Treated            51%             49%             49
        Untreated          77%             23%             13

 C      Treated            68%             32%             34
        Untreated          33%             67%              3

 D      Treated            65%             35%             20
        Untreated          47%             53%             17

 All    Treated            59%             41%            133
        Untreated          53%             47%             51



The most dramatic difference between treated and untreated
subgroups at the time of the first follow-up occurs at Site D. This
finding is somewhat surprising in view of the fact that Site D was
not functioning well early in the demonstration period. On the
other hand, all of the second-cohort interns at Site D were dropouts
and most of those who stayed long enough to be counted as treated
remained in the program for a long time since they needed many
credits to graduate. Most were still there when the program was
turned around.

Site D also shows the largest change from the first to the
second follow-up. Most of this change, however, can be traced to
five individuals who were employed full-time when the first
follow-up was completed but who were unemployed six months later. Part of
this reduction can be attributed to the fact that more students are
employed full-time during the summer (when the first follow-up was
undertaken) than during the school year (35% vs. 29% across all
sites). Perhaps more important, however, is the fact that the
employment situation was quite good at Site D when the first
follow-up was undertaken and quite bad six months later.

Tables 61 and 62 present the school/employment status data for
the third-cohort treated, untreated, and control groups. As was the
case with the second cohort, the across-site treated group is
significantly better off than the untreated group at the time of the
first follow-up (Chi Square = 5.62, p < .025). The difference,
however, becomes nonsignificant by the time of the second. None of the
treated-versus-control or treatment-versus-control comparisons is
statistically significant either at individual sites or across sites
on the first or the second follow-up.



                            Table 61

  School/Employment Status of Treated, Untreated, and Control
        Group Members: First Follow-Up, Third Cohort

                       % in School    % Not in School    Sample
Site    Group          or Employed    and Unemployed      Size

 A      Treated            66%             34%             53
        Untreated          50%             50%             34
        Control            71%             29%             41

 B      Treated            72%             28%             92
        Untreated          38%             62%             13
        Control            67%             33%             49

 C      Treated            73%             27%             30
        Untreated          67%             33%             15
        Control            56%             44%             27

 D      Treated            56%             44%             66
        Untreated          50%             50%             14
        Control            55%             45%             31

 All    Treated            66%             34%            241
        Untreated          51%             49%             76
        Control            64%             36%            148




                            Table 62

  School/Employment Status of Treated, Untreated, and Control
        Group Members: Second Follow-Up, Third Cohort

                       % in School    % Not in School    Sample
Site    Group          or Employed    and Unemployed      Size

 A      Treated            60%             40%             50
        Untreated          --              --              --
        Control            61%             39%             41

 B      Treated            61%             39%             89
        Untreated          42%             58%             12
        Control            57%             43%             44

 C      Treated            56%             44%             59
        Untreated      [illegible]     [illegible]         27
        Control            53%             48%             38

 D      Treated            47%             53%             77
        Untreated          29%             71%             21
        Control            61%             39%             38

 All    Treated            56%             44%            275
        Untreated          44%             56%             89
        Control            58%             42%            161



Fourth-cohort school/employment status data are presented for
the first (and only) follow-up in Table 63. Both the across-site
treated-versus-control and the treatment-versus-control comparisons
are statistically significant (Chi Squares = 9.62 and 10.09
respectively, p < .01 in both cases). These comparisons are also
significant at Site D (Chi Squares = 27.88 and 43.74 respectively,
p < .0001 in both cases). As was the case with the third cohort,
the treatment group at Site D is significantly better off than the
treatment groups at the other three sites (Chi Square = 12.21,
p < .001).
























                            Table 63

  School/Employment Status of Treated, Untreated, and Control
        Group Members: First Follow-Up, Fourth Cohort

                       % in School    % Not in School    Sample
Site    Group          or Employed    and Unemployed      Size

 A      Treated            54%             46%             52
        Untreated          68%             32%             47
        Control            61%             39%             46

 B      Treated            61%             39%             51
        Untreated          69%             31%             16
        Control            67%             33%             58

 C      Treated            71%             29%             51
        Untreated          50%             50%              8
        Control            55%             45%             22

 D      Treated            82%             18%             95
        Untreated          64%             36%             22
        Control            45%             55%             95

 All    Treated            69%             31%            249
        Untreated          66%             34%             93
        Control            55%             45%            221























IV. DISCUSSION 



The second interim Task B report (Tallmadge & Yuen, 1980)
described how implementation events could affect program outcomes.
It did not, however, attempt to tie outcome data directly to these
events. Such an attempt was made in the present report and a
surprisingly high degree of correspondence was found.

In a few instances, outcomes could not be explained in terms of
events at the sites. More often, however, they could. Retention
rates, for example, were high when the programs were running well
and the site climates were positive. They fell with remarkable
regularity at times when implementation, staffing, and/or morale
problems arose. Similarly, substantial achievement gains in math
were observed when qualified math teachers were present. No such
gains were observed when math instruction had to be conducted by
teachers with other subject-matter specializations.

These relationships between program events and student outcomes
are not, and should not be, unexpected. It is eminently sensible
that treatment effects should be observed after effective
treatments. In the case of the present study, however, these
relationships play an unusually important role as one attempts to assess the
overall value of the CIP.

There were many implementation problems. They were compounded
by unrealistic schedules, uncertain funding, an intrusive evaluation
design, and a complex, cumbersome, and somewhat non-responsive
decision-making structure. For these reasons, one must consider
what might have been, as well as what was, in order to arrive at a
fair assessment of the CIP.

All four of the CIP replications experienced periods when the
program was being implemented well. Two of the sites had extended
periods when, in the opinion of the RMC site visitors, the program
was operating in a nearly flawless manner. All four sites also
experienced periods of substantial disarray and two of them were "in
trouble" during at least half of the demonstration period.

The authors of this report, given the circumstances just
described, feel that a fair evaluation of the CIP must consider both
the impact of the program when it is being fully implemented and the
feasibility of attaining this level of implementation. The latter
type of assessment is particularly difficult to make, unfortunately,
and depends to a large extent on subjective judgments made by the
evaluators. Even if one chooses to ignore considerations of
implementation feasibility, however, it is important to recognize
that outcome measurements taken when a program is not properly or
fully implemented reveal little or nothing of what would happen if
the same program were implemented as intended.

4. A full-blown discussion of the feasibility of implementing
the CIP is beyond the scope of this report. The final Task A
report (Treadway et al., 1981), however, is devoted almost in its
entirety to this topic and should be consulted by the interested
reader.

The results of nearly all of the analyses presented in the
preceding chapter were mixed. If, however, one dismisses some of
the negative findings as the logical outcomes of poor
implementation, the overall picture becomes substantially more positive.

Holding Power

The attrition data are not easy to interpret on an overall
basis. Although the treatment and control groups showed
approximately equal attrition rates from pre- to midtest and from pre- to
posttest, the groups were treated differently. Treatment group
students who failed to enroll in the CIP or who dropped out or were
terminated were systematically excluded from subsequent testings.
These individuals were automatically added to the attrition list
even though they might have returned for testing had they been
allowed to do so (as all members of the control group were allowed
to do). When this difference is taken into account, it appears that
the program does have substantial holding power over its participants.

While the preceding inference is based on somewhat tenuous
evidence, it was supported by analyses of individual-site attrition
data. There was a remarkably clear pattern of poor implementation
being accompanied by high attrition and vice versa. At least when
the programs were functioning well, it seemed that they did a good
job of retaining their students.

Cognitive Achievement

In the area of reading achievement, the results of the various
analyses were somewhat less positive than had been expected. The
across-site and across-cohort norm-referenced gains were
statistically significant at both mid- and posttest times. At posttest time
the gain was also large enough (6.7 NCEs) to be considered
educationally significant. The other gain estimates, however, were
disappointing. The main question raised by the difference between
the norm-referenced and the other gain estimates is which of them is
the more credible?

An examination of the data in Tables 3 through 6 reveals that,
overall, statistically significant norm-referenced gains were made
not only by the treatment group but also by several control and
comparison groups. It was the gains made by these other groups that
caused the covariance and standardized-gain analyses to produce
primarily nonsignificant results, since these approaches yield
estimates that are generally quite close in size to the difference
between the norm-referenced gains of the treatment group and the
corresponding gains of the control or comparison groups. This same
relationship also explains the fact that, where the norm-referenced
gain of the comparison group was small (as in the case of the Site A
Alternative High School comparison group), the treatment effect
estimate derived from the standardized-gain analysis was large and
statistically significant (see Table 10).

While the relationship between the norm-referenced gain
estimates and those produced by the covariance and standardized-gain
analyses is understandable, the question remains as to whether the
norm-referenced estimates reflect real gains or are the result of
some artifact of the study procedure. It must be acknowledged, for
example, that the normative interpolations and extrapolations
required by the circumstances of the CIP replications, as well as
the assignment of students to grade norms based on their ages, may
have introduced biases into the norm-referenced evaluation.

If, indeed, the procedures used to implement the
norm-referenced analyses introduced bias, then the norm-referenced gain
estimates are too high. A more accurate picture of the CIP's impact
on reading achievement is then provided by the other analyses. If,
on the other hand, one rejects the hypothesis that bias was
introduced into the norm-referenced evaluation, one must accept the fact
that the gains made by the third-cohort control and comparison
groups and by the fourth-cohort control groups were real. This
position, in turn, is difficult to accept since there is some doubt
that the "treatment" received by most of these groups (Comparison
Group 2 at Site B is an exception) was as effective as that of the
CIP. In the case of the dropout group in particular, there was
presumably no reading-related instruction whatsoever.

One possible explanation is that the gains resulted from
operation of the John Henry effect. Another is that there may have
been systematic attrition in the control and comparison groups. It
does not seem unlikely, in fact, that the members of these groups
whose skills had improved would be more highly motivated to attend
the posttest session than those who had made no gains. Such
students, of course, would be atypical representatives of their groups
and would not, therefore, provide a fair baseline against which to
measure the impact of the CIP.
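
The systematic-attrition argument can be made concrete with a small
simulation. The sketch below uses entirely hypothetical numbers (a
group whose true mean gain is zero, a standard deviation of 10 NCEs,
and return rates of 70% for students who improved versus 30% for those
who did not); none of these values comes from the study. It simply
shows that selective return for posttesting can, by itself, produce an
apparent gain among those who are tested.

```python
import random
import statistics


def simulate_selective_return(
    n=10_000,               # hypothetical group size
    sd=10.0,                # hypothetical spread of individual gains (NCEs)
    p_return_gain=0.7,      # hypothetical return rate for students who improved
    p_return_no_gain=0.3,   # hypothetical return rate for students who did not
    seed=0,
):
    """Return (mean gain of everyone, mean gain of those who show up)."""
    rng = random.Random(seed)
    gains = [rng.gauss(0.0, sd) for _ in range(n)]  # true mean gain is zero
    returned = [
        g
        for g in gains
        if rng.random() < (p_return_gain if g > 0 else p_return_no_gain)
    ]
    return statistics.mean(gains), statistics.mean(returned)


all_mean, returned_mean = simulate_selective_return()
print(f"mean gain, whole group:      {all_mean:+.2f} NCEs")
print(f"mean gain, posttested only:  {returned_mean:+.2f} NCEs")
```

With these assumed return rates the posttested subgroup shows a
positive mean gain even though the group as a whole did not change,
which is the kind of unfair baseline the paragraph above describes.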

It is likely that a related sort of self-selection also
occurred in the treatment group. Posttest data, however, were
collected from very nearly all of the students enrolled in the CIP at
posttest time. These students were, therefore, representative of
the group that had received twelve months of the CIP treatment.
While they were very likely not representative of the original
treatment group, it can be assumed that failure to make substantial
gains in reading was probably not a major cause for attrition from
the program. For this reason, treatment-group data may be somewhat
less biased than control-group data, at least in the area of reading
achievement.




Whatever self-selection may have occurred in treatment and
control groups during the CIP demonstration most probably resulted
from feelings or motivations that were not directly tapped by the
instruments used in the evaluation. Nevertheless, a decision was
made to explore the possibility that those who dropped out of the
groups did differ from those who remained, in terms of the
achievement and affective measures that were used. To accomplish this
task, mean pretest scores were calculated for two groups: those
individuals who were neither mid- nor posttested (and thus
presumably had dropped out) and those who had been either mid- or
posttested (or both). This was done by site, by cohort and across
sites, by cohort using reading, math, internal-external, and
self-esteem scores.

There were 20 across-site analyses, two of which were
statistically significant at the 5% level. In the third cohort,
individuals assigned to the treatment group who did not remain in the
program obtained significantly higher math scores than members of the
treatment group who did remain. This difference was primarily due
to a 9.8 NCE differential observed at Site A. Neither the
across-site nor the Site A difference appeared in second- or fourth-cohort
data, however.

In the fourth cohort, members of the control group who returned
for mid- and/or posttesting had a significantly higher mean score on
the Self-Esteem scale than did control students who failed to
return. This difference was not present in the third-cohort data.
There were also no significant self-esteem differences at individual
sites in either the second or third cohorts.

At individual sites, there were four additional statistically
significant (p < .05) differences. Since 80 comparisons were made,
however, 4 is the exact number that would be expected to be
"significant at the 5% level" by chance alone.
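
The chance-expectation argument is simple arithmetic: the expected
number of spuriously significant results is the number of comparisons
times the significance level. The sketch below works through it and,
as an added illustration not found in the report, also gives the
probability of seeing at least the observed number of significant
results if every null hypothesis were true. It assumes the comparisons
are independent, which is unlikely to hold exactly for these data, and
the function names are hypothetical.

```python
from math import comb


def expected_false_positives(n_tests, alpha):
    """Expected count of 'significant' results when every null is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """P(X >= k) for X ~ Binomial(n_tests, alpha); assumes independent tests."""
    below_k = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below_k


# Individual-site comparisons: 80 tests, 4 significant at the 5% level.
site_expected = expected_false_positives(80, 0.05)
site_prob = prob_at_least(4, 80, 0.05)

# Across-site comparisons: 20 tests, 2 significant at the 5% level.
across_expected = expected_false_positives(20, 0.05)
across_prob = prob_at_least(2, 20, 0.05)

print(f"individual sites: expected {site_expected:.1f}, P(>=4) = {site_prob:.2f}")
print(f"across sites:     expected {across_expected:.1f}, P(>=2) = {across_prob:.2f}")
```

Under the independence assumption, neither count of significant
differences is unusual for a set of tests in which no real differences
exist.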

The attrition analyses, although they did produce a few
statistically significant differences, shed little light on possible
self-selection biases. While it is interesting that consistent
patterns were not found in these analyses, their absence does not
remove the possibility that attrition from treatment and control
groups was systematic. In fact, the authors believe that at least
some of these control group students who returned for mid- and/or
posttesting were motivated by competitive feelings, thus producing
a John Henry effect.

It is indeed unfortunate that so much speculation is required
for the interpretation of the reading (and other) results. The only
data, however, that should be free of the various contaminating
influences discussed above are those used in the matched-pairs
analyses. Unfortunately, there the sample sizes are so small that
the gain estimates are necessarily unstable.






The picture was much the same with respect to math. The
majority of the norm-referenced and standardized-gain analyses
showed statistically significant treatment effects, at least at
posttest time. On the other hand, only a few of the covariance
analyses yielded significant results at midtest time and none was
significant at posttest time. The overall norm-referenced gain
estimate at posttest time was 4.3 NCEs, somewhat smaller than the
gain in reading but still highly significant. As pointed out in
the Results chapter, the smaller size of the math gain is probably
attributable to the difficulty all sites had in hiring and retaining
qualified math instructors.

As was the case with reading, several control and comparison
groups also made statistically significant norm-referenced gains.
Most frequently in the case of the comparison groups, however, these
gains were smaller than those observed in the corresponding
treatment groups. As a result, all but two of the comparison group
analyses showed statistically significant gains at posttest time.

The gain estimates derived from the matched-pairs analysis were
smaller than the others and frequently even negative. All of them,
however, were plagued by small sample sizes. The third-cohort,
individual-site analyses are illustrative of the kinds of
variability that can be expected with such small samples. The
apparently large between-site differences are almost certainly
meaningless as none of the gain estimates is significantly different from
zero.

Career Development Inventory 


Most of the analyses performed on the CDI Planning scale showed
statistically significant gains both at individual sites and when
the data were combined across sites. The situation was slightly
less positive for the Resources scale. On the Information scale,
the results were generally non-significant at midtest time, but the
majority were significant at posttest time.

Care must be taken not to over-interpret the statistically
significant gains made by interns on the Planning and Resources
scales. The Planning scale in particular does not reflect ability
to plan. The scale is made up of such items as "Talking about my
career decisions with an adult who knows something about me." The
student response, "I have not given any thought to this" earns one
point while the response, "I have done this" earns six points.
There are various response options between these two extremes that
earn intermediate numbers of points.

It seems to the authors that "gains" on items of this type are
more descriptive of the treatment itself than of its impact. It is,
for example, an integral part of the CIP for interns to discuss
career objectives, plans, and decisions with career developers. It
would appear then that any intern who failed to respond, "I have
done this" must have misunderstood the question. Neither the
question nor the response, however, gets at the issue of whether the
discussion influenced the intern or was useful in any way.

Since the CDI Planning scale contains a significant number of
similar items (items that would be expected to show gains simply as
a result of participating in the CIP rather than benefiting from
it), it must be concluded that the observed gains do not necessarily
reflect benefits accrued by the interns.

The CDI Resources scale is made up of similar items and the
same argument advanced with respect to the Planning scale is
equally applicable. Gains do not necessarily reflect benefits
accrued by the interns.

The items that make up the CDI Information scale are of a more
traditional nature. They have correct and incorrect response
alternatives and tap career-related knowledge. Gains on this scale
should, therefore, reflect an actual increase in interns' career
awareness.

The study conducted by Gibboney Associates (1977) produced
almost identical findings with respect to the CDI. After 10 weeks
of program participation, there were significant gains on the
Planning and Resources scales and no gain on the Information scale.
After a year of program exposure, however, there were small but
statistically significant gains on the Information scale. The small
size of the gains was explained in terms of mismatch between the
career-related instruction provided by the CIP and the questions
contained in the test. That argument appears valid: interns learn
about specific careers that are of interest to them, while the CDI
Information scale is concerned with more general issues such as
relationships between aptitudes and types of careers. The failure
of the Information scale to show bigger gains should not be
interpreted to mean that interns learned little about careers. A more
relevant instrument might well have shown much larger gains.

The gains on the Planning and Resources scales should not be
dismissed as lightly as the preceding comments might imply. While
they reflect changes in exposure rather than the effects of the
exposure, the changes are quite large. It is probably safe to
assume that the exposure had at least some impact, and an optimistic
inference might be that the increased exposure contributed
significantly to the skills of interns in career planning and in the use
of career-related resources.

One final point relating to the Career Development Inventory:
the gains on all three scales were uniformly larger at posttest time
than they were at midtest time. This pattern, which was also
observed in reading and math, suggests that growth proceeds as a
direct function of the length of program exposure.



Other Non-Cognitive Measures 



Unlike the Gibboney Associates (1977) study, a number of
statistically significant gain estimates were found on the Coopersmith
Self-Esteem Inventory. More statistically significant gains were
found on the Self-Esteem scale in the third-cohort analyses than in
either the second- or fourth-cohort analyses. The improved quality
(compared to the second cohort) and greater amount (compared to the
fourth cohort) of counseling available to third-cohort interns was
offered as a possible explanation for this finding. While gains on
the Self-Esteem scale amounted to only a few raw-score points and
their educational significance may be questionable, the evidence
suggests that the influence of the CIP on self-esteem scores was
large enough to be reliably measured.

Of some 60 analyses involving the Coopersmith Openness scale,
9 produced statistically significant gains (3 of which favored the
control group). Since the goals of the CIP appear unrelated to what
this scale measures, no attempt was made to interpret these
findings.

With respect to the Rotter Internal/External scale, even fewer
of the gains (4 out of 60) were found to be statistically
significant. This finding was somewhat surprising since common sense, as
well as on-site ethnographic observations (Fetterman, 1981), suggest
that long-term participants in the program should feel increased
control over the events of their lives. The authors' beliefs on
this matter are sufficiently strong, in fact, to lead them to
believe that the negative results stem from the fact that the
instrument is simply not sensitive to the kinds of changes that
occurred.



Follow-Up Outcomes

The follow-up data are more directly related to the stated
goals of the CIP than either the attrition or the test score data.
One of the program's stated goals is to assist dropouts and
potential dropouts to obtain their high school diploma. While the
number of actual CIP graduates from the third and fourth cohorts
(where control groups were available) was too small to show
statistically significant gains, comparisons between treatment and
control groups in terms of the number that had graduated from high
school, were currently enrolled, or had earned a GED were generally
favorable.

For the fourth cohort, the high school status of the treatment
group was significantly better than that of the control group at
Site [illegible] and across sites. This was despite the situation at
Site A, where the control group presented a better picture than the
treatment group (although not significantly so) and significantly
better than the control groups at the other three sites (p < .01).

The third-cohort data showed a significant advantage for the 
treatment group over the control group at Site C. The negative
results at Site A, however, prevented the difference from being 
significant overall. When data were combined across the other three 
sites, a significant advantage was again found for the treatment 
group. 

The second cohort had no control group. A larger percentage of 
treatment group members had graduated from high school, were cur- 
rently enrolled, or had earned a GED, however, than was the case 
with either the third or fourth cohorts. This relationship held at 
both the first and second follow-ups largely because the results at 
Site A had not yet turned bad. 

The second stated goal of the CIP to which follow-up data were 
relevant was that of smoothing the transition from school to work. 
Because large numbers of students were still enrolled in school, 
however, it seemed most appropriate to compare treatment and control 
groups in terms of the numbers "either in school or employed" versus
"not in school and not employed."

The results of these comparisons were somewhat less favorable 
than those related to high school status, but still generally 
encouraging. The fourth-cohort treatment group presented a better 
picture than the control group both at Site D and overall on the 
only follow-up that was conducted on that cohort. There were no 
significant differences between treatment and control groups for the 
third cohort, but the treated subgroups were superior to the untreated
subgroups in both the second and third cohorts at the time of
the first follow-up.

Perhaps a more positive picture would emerge if we had information
regarding the quality of jobs that were held. While
queries were made regarding salary levels and probabilities for
advancement, too few credible responses were received to show
statistically reliable differences between groups.



A Note on Implementing the Evaluation 


As pointed out repeatedly throughout the report, this study was
plagued by small sample sizes and high (possibly differential)
attrition rates. While these conditions seriously restricted RMC's
ability to conduct rigorous analyses and to reach conclusions that
were unencumbered by excessive numbers of caveats, it is not clear
that much could have been done to reduce the problems. Recruiters
at all four sites left few, if any, stones unturned in their attempts
to attract large numbers of students. In fact, their efforts
to meet contractually specified treatment and control group quotas
may have been excessively zealous. The authors' impression is that
some students were almost literally dragged in and that some of the
early attrition stemmed from the fact that these students were never
seriously interested in the program.



At mid- and posttest data-collection times, it was possible to
test virtually all of the students then enrolled in the CIP. We
were unable, however, to obtain high participation rates from students
in the control and comparison groups. The students themselves
are highly mobile and difficult to track. The resources available
for the study were sufficient to support only one half-time and one
quarter-time assistant at each site, and this manpower level was
inadequate for the task. We would recommend at least one full-time
and one half-time site assistant at each location.

Another unanticipated problem was that many of the control and
comparison group students were enrolled in other schools. Collecting
data from them would have been facilitated had we been able to
conduct testing sessions in the schools. While some schools were
willing to cooperate in this manner, others were reluctant, given
that there were no incentives for them to do so. The authors
believe that future studies of this type should attempt to arrange
an incentive system so that better cooperation can be obtained.








V. SUMMARY AND CONCLUSIONS 



The analyses presented earlier in this report provide substantial
evidence that the Career Intern Program had a positive impact
on participating students. Statistically significant gains were
observed on standardized reading and math tests, on all three scales
of Super's Career Development Inventory, and in self-esteem. In
addition, a significantly larger proportion of the treatment group
had graduated from high school, was currently enrolled, or had
obtained a GED than was the case for the control group. Evidence
with regard to school/employment status was less compelling but
still generally positive. Finally, there was evidence that the program
was able to retain students, particularly when it was operating
well.

The issue of implementation is very important to the understanding
and proper interpretation of the study results. When the
programs were not functioning smoothly, absenteeism and attrition
were high and achievement gains tended to be low. Similarly, when
programs had to operate without a qualified math teacher (even if
all other aspects of the program were working well and attendance
was high) students failed to make significant gains. Gains in
self-esteem appeared to require both extended involvement in the
program (they did not emerge until posttest time) and extensive
contact with qualified counselors.

Relationships of this type were fairly obvious in the data,
perhaps because all of the sites experienced substantial implementation
problems at various times during the demonstration period. In
addition to highlighting relationships, implementation problems also
produced negative results. Thus the data should not be taken as an
accurate gauge of what the CIP can do. The existing evidence suggests
that the program would have had substantially greater impact
had fewer implementation problems been encountered.

The initial success at Site B and the delayed but ultimately
outstanding performance at Site D stand as testimony that the program
can be implemented effectively. The outcome data from those
sites at those times are overwhelmingly positive and would seem to
provide the best estimate of what the CIP can accomplish.

In addition to problems resulting from incomplete program
implementation, the evaluation was hampered by very high attrition
rates. At least to some extent, the high attrition resulted from
the need to meet contractually specified enrollment quotas that were
unrealistic for new and unproven programs. Many students assigned
to the treatment group never even enrolled in the program, while
substantial numbers of others dropped out almost immediately. In
any case, one major consequence of the high attrition rate was the
threat it posed to the internal validity of the treatment-control
evaluation design.







Because of hazards associated with randomized experiments when
attrition is high, several other evaluation strategies were also
employed. As it turned out, the different strategies yielded somewhat
different results. In reading and math, for example, the
findings of the norm-referenced evaluations were substantially more
positive than those of the covariance and standardized gain analyses,
which used control and comparison groups respectively. In reading,
the norm-referenced gain estimate for the 280 students in the
combined third and fourth cohorts was 7.4 NCEs (from the 24th to the
36th percentile) while the corresponding covariance estimate was 2.6
NCEs. In math, the corresponding gains were 4.3 NCEs (from the 12th
to the 17th percentile) and 1.4 NCEs, respectively.
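
The relationship between the percentile ranks and the NCE gains quoted
above can be illustrated with a short sketch. It assumes the usual
definition of the Normal Curve Equivalent (a normalized scale with a
mean of 50 and a standard deviation of 21.06); the report itself does
not spell out the conversion, and the function names below are
hypothetical. Because the percentiles in the text are rounded, the
computed gains only approximate the reported 7.4 and 4.3 NCEs.

```python
from statistics import NormalDist

# Assumed standard NCE scaling: mean 50, SD 21.06, so that NCEs of
# 1, 50, and 99 coincide with percentile ranks 1, 50, and 99.
NCE_MEAN = 50.0
NCE_SD = 21.06


def percentile_to_nce(percentile: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    return NCE_MEAN + NCE_SD * z


def nce_gain(pre_percentile: float, post_percentile: float) -> float:
    """NCE gain implied by a move from one percentile rank to another."""
    return percentile_to_nce(post_percentile) - percentile_to_nce(pre_percentile)


if __name__ == "__main__":
    # Percentile pairs taken from the text (reading, then math).
    print(round(nce_gain(24, 36), 1))
    print(round(nce_gain(12, 17), 1))
```

Equal percentile differences do not correspond to equal NCE
differences, which is why gains are stated in NCEs rather than in
percentile points.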

The reason for this difference derives from the fact that the
control groups also showed positive (norm-referenced) growth in
reading and math. It is the authors' opinion that these gain
estimates did not arise from biases inherent in the unusual manner
in which the norm-referenced evaluation had to be implemented but
rather are real. We also believe, however, that the gains did not
result from any instructional treatment the control group members
received but instead from some combination of a self-selection bias
(65% of the control group members chose not to participate in the
posttesting session despite a monetary incentive of approximately
$20 to do so) and a John Henry effect. The plausibility of the John
Henry effect, in turn, derives from the fact that all members of the
control group sought, but were denied, admission to the program.

All three scales of the Career Development Inventory showed 
several statistically significant treatment effects in individual- 
site analyses. Across sites, the gain estimates were significant 
in over half of the cases. 

Statistically significant gains in self-esteem were observed in 
half of the covariance and standardized gain analyses at posttest 
time but in none of the midtest analyses. It was inferred that a 
substantial amount of treatment is required to effect gains in self- 
esteem. 

Very few of the analyses involving the Rotter Internal-External 
scale produced statistically significant gains. The authors' own
observations, however, and the ethnographic analyses reported by 
Fetterman (1981) suggest that this finding is misleading. It seems 
far more likely that CIP students did gain a feeling of control over 
their lives from the program but that the gain failed to manifest 
itself in the test scores. 



Several of the other instruments used in this study seem less 
than optimum in retrospect. A particularly salient example is the 
Information scale of the Career Development Inventory. While
statistically significant gains were made on this scale none of them 
exceeded two raw score points. 






All CIP students participate in a semester-long career coun- 
seling seminar. In addition, a career-development plan is worked 
out for each intern. The interns research two career fields in 
depth and participate in two week-long, hands-on job experiences.
It seems impossible that the total impact of these learning experi- 
ences can be reflected by two raw score points. While no more 
appropriate instrument may be available, it is nearly inconceivable 
that a better, more relevant one could not be developed. Where the 
future funding of a program may hinge on the results of an impact 
evaluation, it seems of utmost importance to employ tests which are 
relevant to the goals and curriculum of that program. 

As regards relevance, it is important to point out here that
gains on paper-and-pencil tests such as were used in this study are
not a major objective of the CIP. Such gains are, at best, intermediate
objectives that may or may not be highly relevant to the
program's primary goals of helping participants earn their high
school diploma and enhancing their employability. Other data,
however, strongly support the CIP's success in achieving the first
of these primary objectives and provide at least some support for
success in the second.



Comparability of Designs 



Each of the designs used in this study provides an estimate
of the CIP's impact on participating students. That estimate, in
turn, is derived from an evaluation of student performance after
participating in the program and an estimate of what that performance
would have been had the students not participated. The
post-participation assessment is the same in all designs, but the
"no-treatment expectation" differs.

There is no way of knowing exactly what those students who
participated in an experimental treatment would have done had they
not participated. It is generally accepted, however, that a good
estimate of that performance can be obtained from a similar group
of students who did not participate. The credibility of the
estimate, of course, depends heavily on the extent to which the
two groups are similar.



True Experiments

Randomly assigned groups. One experimental approach that is
often used for the purpose of assuring comparability between
groups is to randomly assign students drawn from a pool of potential
participants to treatment and control groups. Any differences
between the groups that result from random assignment can,
presumably, be adjusted for through use of covariance analysis.

This so-called classic or "true" experimental design provides
unbiased estimates of treatment effects and is generally regarded
as preferable to any quasi-experimental design (such as the
norm-referenced design). Unfortunately, the integrity of the
design can be destroyed by attrition. If the students lost from
one group are systematically different from those lost from the
other, the remaining groups are no longer randomly equivalent, and
covariance analysis can no longer adequately adjust for between-group
differences.

The matched-pairs design. A variation on the random experiment
is one in which pairs of students are formed prior to the
assignment process in such a way that their members are as much
alike as possible in all ways relevant to the experiment. One
member of each pair is then selected randomly for assignment to
the treatment group while the other is assigned to the control
group.

If the matching is good, initial differences between groups
should be close to zero, thus obviating the need for any analysis
of covariance-like adjustment. Furthermore, if both members of a
pair are discarded when either member is lost through attrition,
the remaining treatment and control groups will still be randomly
equivalent. Apart from the practical difficulties associated with
implementing this design, its only real drawback is that it is
more severely affected by attrition than its less sophisticated
counterpart. For example, if attrition results in the loss
of 40% of all students, it will result in the loss of 64% of all
pairs. The samples remaining for analysis would thus encompass
60% of the original groups for the simple random design but only
36% for the matched-pairs design.
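
The arithmetic behind the 60% and 36% figures can be sketched as
follows. The sketch assumes, as the example in the text implicitly
does, that each student is lost independently and with the same
probability in both groups, so that a pair survives only when both of
its members survive. The function name is hypothetical.

```python
def retention_rates(attrition: float) -> tuple[float, float]:
    """Return (share of students kept, share of matched pairs kept).

    Assumes independent attrition at the same rate for every student;
    a pair is kept only if both of its members are kept.
    """
    if not 0.0 <= attrition <= 1.0:
        raise ValueError("attrition must be a proportion between 0 and 1")
    kept_students = 1.0 - attrition
    kept_pairs = kept_students ** 2
    return kept_students, kept_pairs


if __name__ == "__main__":
    students, pairs = retention_rates(0.40)
    print(f"simple random design keeps {students:.0%} of each group")
    print(f"matched-pairs design keeps {pairs:.0%} of the pairs")
```

If attrition within pairs is correlated (for example, when matched
students tend to leave together), the matched-pairs design would retain
more than this independence assumption implies.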



Adjustments for Initial Differences between Groups 

Both of the true-experiment designs assess treatment effects
through use of a no-treatment expectation derived from a sample of
students believed to be equivalent to those who actually participated
in the treatment. The matched-pairs design does this in a
straightforward manner by simply comparing the posttest performance
of the two groups. The simple random experiment, on the
other hand, may frequently require that an adjustment be made for
non-trivial pretest performance differences between groups resulting
from the (unmatched) random assignment process.


The assumptions underlying the adjustments that are available
to the evaluator may not be met under even the best of conditions.
They become increasingly problematical when attrition is high,
when there are reasons for suspecting the existence of real
differences between the treatment and control groups, or when
assignment to the control group may itself affect the behavior of
the students.

Under conditions where assignment to treatment and control
groups was indeed random and where there was no attrition from
either group, analysis of covariance procedures are considered
most appropriate to adjust for whatever pre-treatment differences
may exist between groups. In the typical two-group (treatment and
control) situation, the covariance adjustment entails multiplying
the difference between the groups' pretest means by the slope of
the common, within-group regression line. (The within-group line
is used under the assumption that it is a more accurate and
stable estimate of the population value than that provided by
either group separately.) The result of this calculation is then
used to adjust the posttest means of the two groups.
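
The adjustment just described can be sketched in a few lines. This is
an illustration under stated assumptions rather than the computation
actually used in the study: the common slope is taken to be the usual
pooled within-group slope (summed within-group cross-products divided
by summed within-group pretest sums of squares), and the function
names and scores are hypothetical.

```python
from statistics import fmean


def _within_group_sums(pre, post):
    """Means plus within-group sum of squares (pre) and cross-products."""
    mean_pre, mean_post = fmean(pre), fmean(post)
    ss_pre = sum((x - mean_pre) ** 2 for x in pre)
    sp = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    return mean_pre, mean_post, ss_pre, sp


def covariance_adjusted_gain(t_pre, t_post, c_pre, c_post):
    """Treatment-minus-control posttest difference, adjusted for the
    pretest difference using the common within-group regression slope."""
    t_mx, t_my, t_ss, t_sp = _within_group_sums(t_pre, t_post)
    c_mx, c_my, c_ss, c_sp = _within_group_sums(c_pre, c_post)
    common_slope = (t_sp + c_sp) / (t_ss + c_ss)
    adjustment = common_slope * (t_mx - c_mx)
    return (t_my - c_my) - adjustment


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    treatment_pre, treatment_post = [12, 22, 32], [20, 30, 40]
    control_pre, control_post = [10, 20, 30], [13, 23, 33]
    print(covariance_adjusted_gain(
        treatment_pre, treatment_post, control_pre, control_post))
```

In the made-up data the raw posttest difference is 7 points, the
pretest difference is 2 points, and the common slope is 1, so the
adjusted gain is 5 points.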

One major assumption underlies the use of a common, within-group
regression line. It is that the two groups are random
samples from a single population. If assignment is random, this
assumption is, by definition, met at pretest time. The treatment
may, however, affect both the mean and the variance of treatment
group posttest scores. Under these circumstances, it seems
inappropriate to regard the two groups as random samples from a
single population at posttest time. Furthermore, since the
slope of the regression line is partially determined by the
variance of posttest scores, it becomes seemingly inappropriate to
calculate a common, within-group regression line.

If one does not use a combined, within-group regression line
to adjust the mean posttest scores of the treatment and control
groups, two other particularly interesting possibilities exist.
The treatment group's regression line could be used to predict
what that group's posttest score would have been had its pretest
score been the same as the control group's. Alternatively, the
control group's regression line could be used to predict what that
group's posttest score would have been had its pretest score been
the same as the treatment group's. Gains would then be calculated
by comparing the predicted posttest score of one group with the
observed posttest score of the other group.

The gain estimate derived from projected treatment group
posttest scores will be different from the one based on projected
control group scores unless the two regression lines are exactly
parallel. The amount of difference between the two gain estimates
will be a joint function of the difference in regression line
slopes and the difference in pretest means. In some instances the
two gain estimates will differ substantially from one another.
Unfortunately, there is no way to determine where "truth" lies.
It is perhaps best to regard the two estimates as boundaries
defining a range within which the true gain is likely to fall.
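
The two separate-line estimates, and the way they bracket a range, can
be sketched as follows. Ordinary least-squares lines are assumed for
each group, and the function names and scores are hypothetical; the
sketch is meant only to show why the two estimates differ by the
product of the slope difference and the pretest-mean difference.

```python
from statistics import fmean


def fit_line(pre, post):
    """Least-squares slope and intercept for predicting post from pre."""
    mean_pre, mean_post = fmean(pre), fmean(post)
    slope = (sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
             / sum((x - mean_pre) ** 2 for x in pre))
    return slope, mean_post - slope * mean_pre


def gain_bounds(t_pre, t_post, c_pre, c_post):
    """Return (lower, upper) gain estimates from the two group-specific lines."""
    t_slope, t_intercept = fit_line(t_pre, t_post)
    c_slope, c_intercept = fit_line(c_pre, c_post)
    t_mean_pre, c_mean_pre = fmean(t_pre), fmean(c_pre)
    # Treatment line projected to the control group's pretest mean,
    # compared with the control group's observed posttest mean.
    via_treatment_line = (t_intercept + t_slope * c_mean_pre) - fmean(c_post)
    # Treatment group's observed posttest mean compared with the control
    # line projected to the treatment group's pretest mean.
    via_control_line = fmean(t_post) - (c_intercept + c_slope * t_mean_pre)
    return tuple(sorted((via_treatment_line, via_control_line)))


if __name__ == "__main__":
    # Made-up scores: treatment slope 1.5, control slope 1.0.
    lower, upper = gain_bounds([12, 22, 32], [20, 35, 50],
                               [10, 20, 30], [13, 23, 33])
    print(lower, upper)
```

With these made-up scores the slopes differ by 0.5 and the pretest
means by 2 points, so the two estimates (9 and 10) differ by 1 point;
with parallel lines they would coincide.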

All of the covariance analyses included in this report were
calculated three different ways: one using a common, within-group
regression line; one using the treatment group's regression line
in the manner described above; and one using the control group's
regression line (also in the manner described above). For
treatment-versus-control group comparisons, the tables in the Results
section present only the findings of the standard covariance
analysis using the common, within-group regression line. However,
where the other analyses yielded results that were substantially
different, they are discussed in the text.

It was mentioned earlier that an important assumption underlying
standard covariance analysis procedures is that the groups
being compared be random samples from a single population. Where
systematic differences are known to exist between the groups prior
to the beginning of the experiment, covariance analysis is thought
to systematically underadjust for pretest differences (Campbell &
Erlebacher, 1970). Under these circumstances, some form of
reliability-corrected covariance analysis (Porter, 1967) or
standardized-gain analysis (Kenny, 1975) is generally considered
to be more appropriate.




The present study employed standardized-gain analyses in all
situations where covariance analysis was also employed. This type
of analysis is exactly comparable to covariance analysis except
that it makes use of the principal axis of the bivariate distribution
of pre- and posttest scores rather than the corresponding
regression line. Because three versions of each covariance
analysis were carried out, the corresponding three versions of
standardized-gain analyses were also conducted (one using the
combined, within-group principal axis; one using the treatment
group's principal axis; and one using the control group's principal
axis).
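
The difference between the two kinds of line can be illustrated with a
small sketch. The report does not give a formula, so the sketch takes
"principal axis" literally as the major axis of the sample covariance
matrix of pre- and posttest scores; that reading, the function name,
and the scores are all assumptions made for illustration.

```python
import math
from statistics import fmean


def regression_and_principal_slopes(pre, post):
    """Return (regression slope, principal-axis slope) for one group.

    The principal-axis slope is that of the major axis of the sample
    covariance matrix; it requires a non-zero pre/post covariance.
    """
    mean_pre, mean_post = fmean(pre), fmean(post)
    s_xx = sum((x - mean_pre) ** 2 for x in pre)
    s_yy = sum((y - mean_post) ** 2 for y in post)
    s_xy = sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
    if s_xy == 0:
        raise ValueError("principal-axis slope is undefined when covariance is zero")
    regression = s_xy / s_xx
    principal = (s_yy - s_xx + math.sqrt((s_yy - s_xx) ** 2 + 4 * s_xy ** 2)) / (2 * s_xy)
    return regression, principal


if __name__ == "__main__":
    # Made-up scores with equal pre and post variances.
    reg, pa = regression_and_principal_slopes([1, 2, 3, 4], [2, 1, 4, 3])
    print(reg, pa)
```

With imperfectly correlated scores the regression slope is the
flatter of the two, so a covariance adjustment removes less of a given
pretest difference than an adjustment along the principal axis. This is
consistent with the underadjustment concern cited above.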



Considering the covariance and standardized-gain analyses
together, six different gain estimates were calculated for each
"experiment" (e.g., Site A treatment group versus regular high
school comparison group). The question immediately comes to mind,
"Which of the six estimates most accurately reflects the true
impact of the program?" If the answer to that question were
known, of course, there would be little point in calculating
the five less accurate estimates. The answer is not known,
however, and therein lies the justification for the multiple
analysis approach.

If the smallest of the gain estimates were statistically and
educationally significant, one would have a high degree of confidence
in labeling the treatment as effective. If five out of the
six estimates were not statistically significant, one would have
to adopt a more conservative stance. The number of statistically
significant estimates thus provides a crude indicator of how much
confidence can be placed in the inferences one draws from the
analyses. While not "scientific" in any strict sense of the word,
considering all six estimates simultaneously is almost certainly a
better approach than selecting one as the "best" because the
circumstances of this study were such that the assumptions of all
of the analyses are violated more often than they are met.



Quasi-Experiments

Because of the high, and probably differential, attrition
that occurred between pre- and posttests, it is not entirely clear
whether the treatment-control comparisons made in this study
should be regarded as true experiments or not. On the other hand,
the comparisons made between the treatment groups and the specially
selected comparison groups at each site cannot be regarded
as true experiments. They are best categorized as a class of
quasi-experiments called the non-equivalent control group design.

The non-equivalent control group design. As pointed out
above, the comparison (as opposed to control) groups used in this
study cannot be considered random samples from the same population
from which the treatment group was drawn. It is to be expected
that they differ from the treatment group in systematic ways and
are samples from different populations. For this reason, the
standardized-gain approach was considered preferable to the
covariance-analysis approach as a strategy to adjust for pretest
differences between groups. The treatment-versus-comparison group
analyses in the tabular presentations of the Results section of
this report thus reflect that mode of analysis.

As was the case with the treatment-versus-control comparisons
(where analysis of covariance results are presented in the tables),
however, analyses were conducted using all six of the
adjustment strategies described earlier in this appendix. Where
results from the other analyses differed substantially from the
standardized-gain results, the differences are discussed in the
text.

It should be pointed out that quasi-experiments attempt to
provide answers to questions that are somewhat different from
those addressed by true experiments. The latter generate estimates
of what the treatment group's performance would have been in
the absence of the treatment. Quasi-experiments simply compare
the posttest performance of the treatment group with that of
another, similar group. They either assume that the groups were
equal in pre-treatment performance levels or they statistically
adjust post-treatment measures to compensate for pretest differences.

The assumption is often made that the posttest (or adjusted
posttest) performance of the comparison group provides a good
approximation of a no-treatment expectation for the treatment
group. It would be more prudent, however, to acknowledge that
quasi-experiments really address the question, "How much better
(or worse) would the treatment group have performed than the
comparison group if the two groups had started out equal?" If
that orientation is taken, the obtained results can be interpreted
in terms of the similarities and differences between the
groups and additional insights may be obtained.

The norm-referenced design. The norm-referenced design
assesses treatment effects in terms of changes in status with
respect to the national norms from pre- to posttest. If a group's
mean pretest score placed it at the 20th percentile prior to
participation in the program being evaluated, and its mean posttest
score placed it at the 25th percentile, the 5-percentile gain
would be attributed to the effect of the treatment. In essence,
the design compares the growth of treatment-group students with
that of students at the same pretest achievement level attending a
nationally representative sample of schools.



The design does not normally provide a local no-treatment
expectation since treatment-group students, if they did not
participate in the treatment, would not be attending a nationally
representative sample of schools. While, from some perspectives,
this characteristic of the norm-referenced design might be viewed
as an advantage, it does make the design systematically different
from designs that use local control or comparison groups.

The evaluation findings presented in the Results section of
this report show several instances where substantial differences
exist between the norm-referenced gains and the gains derived from
control or comparison group analyses. In these cases, it is
interesting to examine the norm-referenced gains made by the
control or comparison group (these gains are also included in the
tables).

Subtracting the norm-referenced gain made by the control or
comparison group from the norm-referenced gain made by the treatment
group yields a treatment-effect estimate that very closely
approximates the estimate derived from the corresponding covariance
or standardized-gain analysis. When used in this manner, the
norm-referenced model does provide a local no-treatment expectation.
The feature that is the primary contributor to the design's
desirability, however, is its ability to produce a gain estimate
without requiring a control or comparison group. Under these
circumstances, of course, it does not provide a local no-treatment
expectation.






APPENDIX B 

Selection of the Achievement Test to be Used
in the CIP Evaluation Study






SELECTION OF THE ACHIEVEMENT TEST TO BE USED 
IN THE CIP EVALUATION STUDY 



The test used to evaluate the achievement gains produced
by the CIP should possess several important characteristics.
To conduct a norm-referenced evaluation the test must have
empirical normative data at grades nine, ten, eleven, and twelve,
based on nationally representative samples of students. To
be sensitive to project impact, the content of the tests should
not be uninteresting, esoteric, or irrelevant to the students
in CIP. It should reflect as closely as possible the emphasis
of the CIP instruction. The level of test selected should
be appropriate for the functional level of the students. The
test should not be so difficult that the average score of the
group tested is at chance nor should it be so easy that, on
the average, students answer more than 75% of the items correctly.
It would also be desirable for the test to have empirical
normative data at more than one point during the year.
The number of test items and time required to take the test
should fall within reasonable limits and the format of the
test booklets should be attractive and easy to follow.

In the review process the following tests were examined:
California Achievement Test (1970 and 1977), Comprehensive
Tests of Basic Skills (1968 and 1973), Diagnostic Mathematics
Inventory (1975), Gates-MacGinitie Reading Tests (1964), Iowa
Tests of Basic Skills (1971), Metropolitan Achievement Tests
(1970 and 1978), Prescriptive Reading Inventory (1975), Sequential
Tests of Educational Progress (1969), SRA Achievement Series
(1971), and Stanford Achievement Tests (1973).

Of this group, only five tests were found to have normative
data at grades nine, ten, eleven, and twelve. Specifically,
the California Achievement Tests (1970 and 1977), Comprehensive
Tests of Basic Skills (1973), Metropolitan Achievement Tests
(1978), and the Sequential Tests of Educational Progress (1969)
fulfilled this requirement.

Each of the five tests was examined in detail. The times
of the year when the test was normed and the forms that are
available were noted. The level of the test intended for high
school students and the next lower (or easier) level of the
test were determined. For each level, the number of items in
each subtest, the time required to take the test, and the length
and topic of each passage were listed. A summary of this information
is provided for each test (see Figures 1 through 5).

This review revealed some significant differences among
the five tests. The passages in the STEP II subtest are longer






Level   Empirical Norming Dates                      Forms
4       6.7, 7.4, 7.7, 8.4, 8.7, 9.4, 9.7            A & B
5       9.7, 10.4, 10.7, 11.4, 11.7, 12.4, 12.7      A & B

Level 4
                     Reading   Reading   Math     Math Concepts
                     Vocab.    Comp.     Comp.    & Problems
No. of Items         40        45        48       50
Testing Time (min)   10        40        28       23

Level 5
                     Reading   Reading   Math     Math Concepts
                     Vocab.    Comp.     Comp.    & Problems
No. of Items         40        45        48       50
Testing Time (min)   10        40        33       22

Content of Level 4 - Reading Subtest

Vocab.
  2- or 3-word phrases, find synonym for word in boldface

Reading Comp.
  Example of Table of Contents
  Example of Index
  5 paragraphs - composition of planet earth, volcanoes, earthquakes
  7 paragraphs - passage about the need to conserve resources
  4 paragraphs - the laser: its history and use
  2 paragraphs - logic statements; diagram of a "statement of order"

Content of Level 5 - Reading Subtest

Vocab.
  2- or 3-word phrases, find synonym for word in boldface

Reading Comp.
  Questions about using a book: glossary, appendix, bibliography
  5 paragraphs - the scientific method vs. authoritarianism
  [number illegible] long paragraphs - Bill of Rights
  4 paragraphs - studying the ocean floor
  4 paragraphs - aptitude measures: kinds, use of results
  7 paragraphs - logic statements [remainder illegible]

Figure 1. Summary of content and other characteristics of
the California Achievement Test (1970)







Level   Empirical Norming Dates                      Forms
18      7.7, 8.1, 8.7, 9.1, 9.7, 10.1                C & D
19      9.7, 10.1, 10.7, 11.1, 11.7, 12.1, 12.7      C & D

Level 18
                     Reading   Reading   Math     Math Concepts
                     Vocab.    Comp.     Comp.    & Problems
No. of Items         30        [illegible; three values]
Testing Time (min)   10        35        25       35

Level 19
                     Reading   Reading   Math     Math Concepts
                     Vocab.    Comp.     Comp.    & Problems
No. of Items         30        40        40       45
Testing Time (min)   10        35        25       35

Content of Level 18 - Reading Subtest

Vocabulary
  2- or 3-word phrases are presented. Student is to find synonym
  of underlined word in phrase.

Reading Comp.
  [number illegible] - the story of Maria Mitchell, the astronomer
      (has a picture)
  1 paragraph - radio commercial advertising Valley Music Store
  2 paragraphs - salesman's speech offering a $3.00 surprise
  4 stanzas - poem about storms
  4 paragraphs - history of guitar (pic. of instruments preceding
      the guitar)
  3 paragraphs - newspaper article about proposed route for state
      highway and letters written in response: 1 pro, 1 con
  4 paragraphs - captain's log describing trip to rescue survivors

Content of Level 19 - Reading Subtest

Vocabulary
  Same as Level 18

Reading Comp.
  [number illegible] paragraphs - report of [illegible] (has picture)
  3 paragraphs - editorial about importance of eating natural foods
  3 paragraphs - speech given by high school student about contributing
      to student community [illegible] (pic. of student addressing group)
  5 paragraphs - description of sun, solar energy, and sun's rays
  3 stanzas - about skyscrapers
  6 long paragraphs - work and life of Orozco the artist
  1 paragraph - radio ad about Tuff Tape

Figure 2. Summary of content and other characteristics of
the California Achievement Test (1977)





Level   Empirical Norming Dates           Forms
3       6.7, 7.7, 8.7                     S & T
4       8.7, 9.7, 10.7, 11.7, 12.7        S & T

Level 3
                      Reading   Reading   Math     Concepts &
                      Vocab.    Comp.     Comp.    Problems
No. of Items          40        45        48       50
Testing Time (min)    11        35        40       35

Level 4
                      Reading   Reading   Math     Concepts &
                      Vocab.    Comp.     Comp.    Problems
No. of Items          40        45        48       50
Testing Time (min)    11        35        40       30

Content of Level 3 - Reading Subtest

Vocab.
  Find synonyms

Reading Comp.
  5 paragraphs - girl willing to keep her promise to babysit even though
                 she would [word illegible] the rock festival
  1 paragraph - very difficult paragraph about ability to tell history
                of an abandoned farm by studying landscape
  2 paragraphs - about the effects of a meteorite crashing to earth in 1947
  Swimming pool rules [word illegible], with results of breaking rules
  5 paragraphs - story about efforts of junior high school students to make
                 community aware of pollution, etc. through "earth day"
  [count illegible] paragraphs - about change in English language from time of Old English
  3 stanzas - a poem about autumn

Content of Level 4 - Reading Subtest

Vocab.
  Select synonyms

Reading Comp.
  5 paragraphs - thoughts of swimmer before he swims his record 200 butterfly
  3 paragraphs - discusses the idea of "humanness" in animals and objects
  5 long paragraphs - Margaret Mead's study of Samoan culture: ways in which
                      individuals learn values from group
  6 paragraphs - communes: reason for development and their advantages
  Poem - expressing sympathy with caged birds
  5 paragraphs - the many choices offered to high school graduates in terms
                 of further education
  3 paragraphs - shrews hunting for food



Figure 3. Summary of content and other characteristics of
Comprehensive Tests of Basic Skills (1973)




L^vl irieal ltor«liig Dttf _ £sm 

Adf. I 7a. 7*7; 8a, s*7, 9a, 9*7 Js ts 

Adv* 2 ' loa, 10*7, ua, u.7^ ua, u»7 js a 

^•^1 Adv. I tfYtl A^^r ? 

t #Mitlii« Coao . Hath t^aklM Cq«d. Hij!L_ 

WO. of UmB '-^Jf^-™ ""^g S 1; 
-Notinr-TlM W «^ 30 40 



imin) 

1 

Cipttiit of Ad T^- j - tiding aubf tt 

1 foraftAph. - iMissM* '^.^^ iur««iodo 
2.p»r«itoib« «kottt skis divlag 

i'oAtMroflia « pi«Ms« itumtlioiAg tha city stroots ss « ploytround 
< «id tbo boatfiti of sports sctlvltlos 

2 Mi(i«rtpho • vory slspls sMMitT of ShokAspomro'ii PtTWII ot4 Thtrtf 
4^p«OftipliO'«^ fotwtloitlof lhorlock_|tel^^ clubs 

^4:p«itt*plMi ^-i<rt-»-ro*ctl^^ ChristM* gift thtt U * 

gmt^diMpPoittOMftC 

i pirsirtplii • liw*^^^ 

I porggrftph « lUrr 8lioU«y> vritiiMI of frwfcWtfttt 

3 p4r«gr«phs • toowdo do Vliicl— Ufs.tad vork 

\ Citiitoat of Ad Y; ? - F^^diM Subttst 

— 

^~j!r:!5?p;j2ifopiis otwoto 

3 Mtiiraphs • •it'hl^toty of .posonords to idsotlfy frlsnds vs. 

* foot— •••hibboloth'* 

' 3 paratraphi^* dovalopiUiir^f Monopoly gsM 
' I paraftapb « dlwui throifliig--liicludM jwny. ntaibsrs sbout <sis«, 

dlstagco,' «tc. * t- t ^ 1 

3 paragraphs - ♦•faalllar straagars'*, dofloltlon. rasuXts of psychological 

study of ^coaawtars 
I paragraph - affacts of ^1<illd and watar o« aarth and traaa 
I paragraph - uapopulat boy idio la a booki«r« 
, ^ paragraph - daacrlptlod ot hoatala 



Figure 4. Summary of content and other characteristics of
Metropolitan Achievement Test (1978)






2 
3 



9*7^ 10.7, 11*7, 12*7 
fr*7, 7.7, S.7 



Vflg^buiMgir f ading co«D V9trt^irr Ktrttni 

^30 30 30. -===:zz4^=r— 

TMtias tixm 15 30 15 30 

' («itt> 



Mo* of Xt«M- 



^^^fw0 typ«f of ttMt SMt«nc«f prtmtti «d ««coiid •wtwc* «uit 
b« c«ipl«t«d wiiBi <»• o£ four cholc««* Word ustd 1ft . 
mi ttnd its synooytt 

9 MrSrapht - frp»|Ch*rUt OlcUni' lijii02att-h«i o d-f«hloutd 
4UiMu« 

8 ttaaMt • do|lttd"lia wfi fn«id,"hav« fight, dog bitM 

but dot iUn ^. 
3 loii« i«r«tr«phs - group* in j^wt •«y,b« thought mor« noblt than 
. : th«y vi«(«d by tbtir coot«por*ri«f (•*gM 
\'^ight«^> 

— 4 lom »«r«gT*ph«^^u«« oriiyabolg ^ - j 

DiiloJ froiT* pXty - ido«yacr»ci«i of . vlU that .uit b« £ul£iH«l 
la ord«t to inhtrit th« »0B«3r 

conttfit of Laya l^ 3 - Midiag gvt^9i^ 

Vocab . 

Um% aa Lavftl 2 

' *"*^ftS5MP*»« - 4iico»ary and iiaa of gUaa to nagnify objaeta 

9 .par^rapha - tha aj-bty of brphaua froii graak nythology-tha 

portanea of wn^it 

5 parggw^i •^^'^ conpoaition'ofrglaaap glaaablojrtng 

6 paragra^t • hiatory of \yiatn«aaaa paopla 

7 aUttsatf • pOM about forgatting . ^ 

7 patagrapha - Sout J^^aappltt of young Oilbart nho latar bacona. 
' - eoikpoaar cf Oilbart 4 Sttlliw fana 



Figure 5. Summary of content and other characteristics of
Sequential Tests of Educational Progress (1969)






than the passages of the others, and the content appears more
difficult. The STEP II norms are based on the performance
of students who were tested almost ten years ago. Using "old"
norms may produce misleading achievement status information
in norm-referenced evaluations. In addition, empirical data
are provided for only one time of the year. Of the five tests,
the STEP II appeared to be the least desirable.

A drawback of the CAT '70 is that reading passages of
both levels include questions about using parts of books (table
of contents, index, etc.) to find information. These questions
would seem to be more appropriate in a subtest covering reference
skills rather than reading comprehension. In addition, the
reading subtests present diagrams of logical relationships
from which the students are asked to draw logical conclusions.
This may be a foreign task to many students. Finally, since
there is a more recent edition of the CAT, it would be preferable
to use the 1977 edition instead of the 1970. For these reasons,
the CAT '70 was felt not to be the best test to use for the
evaluation.

For the CTBS '73, the passages in Level 3 (the level we
would most likely use) are ordered so that two of the more
difficult ones are presented first. This order of presentation
may discourage students so that they either will not respond
to the remaining items or will respond at random. A second
drawback of the CTBS '73 is that empirical normative data are
available for only one month of the year.

The MAT '78 and CAT '77 are the newest of the achievement
tests reviewed. Both tests have empirical normative data for
October and April. A cursory examination of the content of
the reading tests of both the MAT '78 and the CAT '77 showed
that either one would be appropriate to use in the CIP evaluation.
The passages in the CAT '77, however, seem to be more
relevant and inherently more interesting than those of the
MAT '78. For example, the radio advertisement passage, the
salesman's speech, and the newspaper editorial all present
material that reflects "real world" situations that students
are likely to have encountered. Of course, the CAT '77 also has
passages that are probably of less interest: the story of a woman
astronomer, the history of the guitar, and a poem about storms.
The majority of the passages in the MAT '78 deal with topics
that would not be of concern to CIP interns. For example,
there are passages about marmalade, skin diving, and Leonardo
da Vinci.

At a more detailed level the two tests were studied in
terms of the instructional objectives that each test attempts
to measure. In each test's manual, the instructional objectives
upon which the test was constructed are listed and the test
items that measure each objective are identified. These are
presented below in Tables 64 and 65. Although the objectives
selected by the two publishers do not match perfectly, by
collapsing some sub-objectives and relabeling others, it is possible
to make comparisons between the tests. (It should be noted
that the MAT '78 does not offer a separate vocabulary subtest.
Vocabulary items are included in the reading comprehension
section.) Direct comparisons can be made between the two tests
as to the number of vocabulary items each contains and the
number of items asking for literal information. Examination of
the test items suggests that the MAT inferential category of
objectives is equivalent to the CAT interpretive category,
and that the MAT evaluative category is equivalent to
the CAT critical category.

The number and percentage of items under each objective
are presented by test in Table 66. The greatest difference
in content between the two tests is in the number of items
covering literal meaning. The MAT has over three times as
many such items as the CAT. A second difference between the tests
is that the CAT has over twice as many critical thinking items
as the MAT. Assuming that CIP reading instruction focuses
more on teaching students to grasp the literal meaning rather
than the implications of what they read, this analysis indicates
that the MAT would be the more appropriate test to give.
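
The percentages and the "over twice as many" comparison can be checked with a short calculation. The sketch below is illustrative only: it uses the item counts that can be read in Table 66 (the four CAT '77 counts, and the MAT '78 inferential, evaluative, and total counts), and the variable and function names are hypothetical, not part of the original analysis.

```python
# Item counts as they can be read in Table 66 (reading subtests).
# Only legible counts are used; names are illustrative assumptions.
CAT_77 = {"vocabulary": 30, "literal": 7, "interpretive": 21, "critical": 12}
MAT_78_TOTAL = 55
MAT_78_PARTIAL = {"inferential": 19, "evaluative": 5}


def percentages(counts, total):
    """Return each count as a whole-number percentage of total."""
    return {name: round(100 * n / total) for name, n in counts.items()}


cat_total = sum(CAT_77.values())
cat_pct = percentages(CAT_77, cat_total)
mat_pct = percentages(MAT_78_PARTIAL, MAT_78_TOTAL)

# Ratio behind the "over twice as many critical thinking items" statement.
critical_ratio = CAT_77["critical"] / MAT_78_PARTIAL["evaluative"]

print(cat_total, cat_pct)
print(mat_pct)
print(f"CAT critical items per MAT evaluative item: {critical_ratio:.1f}")
```

The CAT counts sum to the 70 items stated in the note to Table 66, which is a useful consistency check on the transcription of the table.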

A similar type of comparison was made between the Mathematics
subtests of the CAT '77 and the MAT '78, as shown in
Tables 67 and 68. The CAT offers two separate subtests: Mathematics
Computation, and Mathematics Concepts and Applications.
The MAT has placed both types of items in a single subtest.
Concepts and applications problems are the first 32 items and
computation problems are the remaining 18.

The two tests are similar in all areas except the number
of computation problems involving fractions and decimals,
geometry and measurement, and numeration. The difference can
be attributed to the fact that the CAT has 35 more items than
the MAT, and they are distributed over these three objectives.
Although the MAT is a shorter test, it is claimed by its publishers
to be as reliable as the other major achievement tests.
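
The 35-item difference can be traced objective by objective. The following sketch is an illustration rather than part of the original analysis: it takes the "Number of Items Measuring Objective" columns from Tables 67 and 68 as transcribed from the scanned copy, and the dictionary and variable names are hypothetical.

```python
# Number of items per objective, read from Table 67 (MAT '78) and
# Table 68 (CAT '77). Names are illustrative assumptions.
MAT_78 = {
    "Graphs & Statistics": 6,
    "Fractions & Decimals": 10,
    "Laws & Properties": 4,
    "Whole Numbers": 8,
    "Problem Solving": 6,
    "Geometry & Measurement": 8,
    "Numeration": 8,
}
CAT_77 = {
    "Graphs & Statistics": 4,
    "Fractions & Decimals": 28,
    "Laws & Properties": 4,
    "Whole Numbers": 8,
    "Problem Solving": 7,
    "Geometry & Measurement": 16,
    "Numeration": 18,
}

# Per-objective surplus of CAT items over MAT items.
differences = {obj: CAT_77[obj] - MAT_78[obj] for obj in MAT_78}
total_difference = sum(CAT_77.values()) - sum(MAT_78.values())

# The three objectives with the largest surplus.
largest_three = sorted(differences, key=differences.get, reverse=True)[:3]

for obj, diff in differences.items():
    print(f"{obj:24s} {diff:+d}")
print("Total difference:", total_difference)
print("Largest three:", largest_three)
```

Under these counts the three largest surpluses fall on fractions and decimals, numeration, and geometry and measurement, the same three objectives the text singles out.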



Conclusions

Either the CAT '77 or the MAT '78 would be suitable for
use in the evaluation of the CIP. Only one test can be selected.
After detailed review of both tests, the MAT '78 was chosen
over the CAT '77. The reasons for this decision are summarized
below.






Table 64

MAT '78 Advanced Level 1, Form JS, Reading Comprehension
Test Items Grouped by Instructional Objective and by Passage

/ ~' - -Uttral Inf rtntlml 

ZAUUI YSSiki SUfilUfi SlBiUi^ SDtciflc C M^r^l Kvluativ 

• 1 6 I 5 2.3 4 

2 7^3,10 9 12 U 

3 18 14.15,17 13 16 

4 19 21,23 20,22 24 

5 30 25,27 26 . 28,29 

6 32,34 31 33.35 36 

7 41,42 39 38 37,40 ^ 

8 48 43,44,46 45 47 
- 9 — -,49;50,52.^ 51,55 53,54 



Table 65 

CAT '77 Level 18, Form C, Reading Comprehension 
Test Items Grouped by Instructional Objective and by Passage 



JSfilki UltUl Inf rpftlv Critical 

Syn* IftcAll In£trrttd' Charaectr nturaclvt ^chor P«r- 
iMUn Ibllii 2£jAfiil HualU AnalY»l« Lantuag* Ac t. mAaion 



31,34 34 32,33,35,37 

38-40 
41-433 

44-50 

51,52,54,56 53,55,57 

58-63 

64 65,67,69 66,68,70 

•0 1-20,21-25, 
26-30 



Table 66

Number and Percentage of Items Under Each Objective

                              MAT '78                    CAT '77
Objective                     No.          %             No.    %
Vocabulary                    [illegible]  [illegible]   30     43
Literal                       [illegible]  [illegible]    7     10
Inferential/Interpretive      19           [illegible]   21     30
Evaluative/Critical            5            9            12     17
Total                         55          100            70    100

Note: CAT '77 has a total of 70 items, including separate subtests
      for vocabulary and reading comprehension.
      MAT '78 has a total of 55 items; vocabulary and reading
      comprehension items are together in a single subtest.






Table 67

MAT '78 Advanced Level 1, Form JS, Mathematics
Item Number and Number of Items Under Each Objective

                                                      Number of Items
Objective                 Item Number                 Measuring Objective
Graphs & Statistics       30, 31, 32, 25, 26, 27       6
Fractions & Decimals      41-50                       10
Laws & Properties         15-18                        4
Whole Numbers             33-40                        8
Problem Solving           1-6                          6
Geometry & Measurement    19-24, 28, 29                8
Numeration                7-14                         8

Table 68

CAT '77 Level 18, Form C, Mathematics Computations and
Mathematics Concepts and Applications
Item Number and Number of Items Under Each Objective

                                                                Number of Items
Objective                   Item Number                         Measuring Objective
Graphs & Statistics         55, 59, 66, 83                       4
  (Functions & Graphs)
Fractions & Decimals        1, 2, 4, 8, 9, 10, 14, 15, 19, 20,  28
                            21-24, 26, 27, 29, 30, 31-40
Laws & Properties           13, 18, 25, 28                       4
  (Math Computation)
Whole Numbers               3, 5, 6, 7, 11, 12, 16, 17           8
  (Math Computation)
Problem Solving             53, 65, 70, 75, 76, 77, 78           7
  (Story Problems)
Geometry & Measurement      45, 46, 48-50, 58, 60, 72, 73,      16
                            78-80, 82-84
                            [item list partly illegible]
Numeration                  41-44, 47, 51, 52, 56, 57, 62-64,   18
                            [remainder of item list illegible]

NOTE: The objectives in parentheses are the labels used by the
      publisher of CAT '77.






The primary disadvantage of the Metropolitan is that its
content appears less interesting than that of the CAT '77 and,
as a result of this, interns may not be as motivated to take
and complete the test. However, the test items of the CAT
'77 include a greater number of higher-level thinking questions
than the MAT '78. Compared to the MAT, the California has
a much larger proportion of test items that require the reader
to make an evaluation or critical interpretation of a passage.
The Metropolitan Achievement Test, in contrast to the California,
has a much larger proportion of test items that require the
reader to make a literal interpretation. Whereas the CAT passages
may be more entertaining to read than the MAT's, the test
questions are more difficult.

A second difference between the two tests is the way in
which the test items are ordered. The questions about any
one passage of the CAT are likely to come from one category
of instructional objective. For example, in the CAT, all of
the questions about passage 3 concern critical thinking, and
all those about passage 4 concern figurative language. In
the MAT, test questions on a single passage always cover more
than one instructional objective. For example, the questions
for passage 3 cover vocabulary and literal, inferential, and
evaluative thinking. A student taking the CAT who finds it
difficult to respond to questions that require critical thinking
may miss all the items about one passage and may become discouraged
about attempting more items. If the same student were
to take the MAT and were to incorrectly answer similar types
of items, the errors would be scattered throughout the test.
The arrangement of the MAT test items would seem superior to
that of the CAT.

An additional advantage of the MAT that has not been emphasized
is that it requires less time to administer. The
MAT reading subtest takes 35 minutes compared to 45 for the
CAT; the MAT mathematics subtest requires 40 minutes versus
60 minutes for the CAT.
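
As a rough illustration of the time saving, the four administration times quoted above can be totaled. This is a minimal sketch; the dictionary layout and names are assumptions made for the example, and only the minute values come from the text.

```python
# Administration times in minutes, as quoted in the paragraph above.
TESTING_MINUTES = {
    "MAT": {"reading": 35, "mathematics": 40},
    "CAT": {"reading": 45, "mathematics": 60},
}

# Total administration time per test battery.
totals = {test: sum(parts.values()) for test, parts in TESTING_MINUTES.items()}

# Minutes saved by administering the MAT instead of the CAT.
minutes_saved = totals["CAT"] - totals["MAT"]

print(totals)
print("Minutes saved per administration:", minutes_saved)
```

Because interns were tested on several occasions, the saving applies to each testing session.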

The MAT also fulfills the other criteria that were listed
at the beginning of the paper. It has empirical norms for
October and April for grades 9, 10, 11, and 12. It is constructed
so that the level of test that is appropriate to the
functional level of the students can be administered and it
is still possible to compare their test performance to that
of grade-level peers.



CAREER DEVELOPMENT INVENTORY 



FORM I



RMC Research Corp.
2570 W. El Camino Real
Mountain View, CA 94040
415/941-9550 






CAREER DEVELOPMENT INVENTORY

FORM I

DONALD E. SUPER, ET AL.

TEACHERS COLLEGE, COLUMBIA UNIVERSITY
NEW YORK, NEW YORK

COPYRIGHT 1971



INTRODUCTION

The questions you are about to read ask you about school,
work, your future career, and some of the plans you may have made.
The only right answers are the ones which are right for you. Later,
some questions ask about career facts; others ask you to judge
students' plans. Give the best answers you can.



Answers to Questions like these can help teachers and 
NAME ____________  GRADE ______  DATE ______



YOUR FUTURE OCCUPATION

In your present thoughts and plans, what kind of work
[text illegible] etc.? Write the name(s) of the occupation(s) you
have thought about on the lines below.



1st choice ____________
2nd choice ____________
3rd choice ____________
[ordinal illegible] choice ____________



The questions begin on the next page. Mark them according
to the instructions at the top of each section.






How much thinking and planning have you done about your educational or
occupational future? What kinds of plans do you [text illegible] answers
to show what [text illegible] your answer in the space to the left of
each statement.

Here are the possible answers:

1 - I have not given any thought to this.

2 - I have given some thought to this, but haven't made any plans
    to do this.

3 - I have some plans to do this, but am still not sure of them.

4 - I have made definite plans to do this, but don't know how to
    carry them out.

5 - I have made definite plans to do this, and know what to do to
    carry them out.

6 - I have done this.

Here are the statements:

1. [Beginning of statement illegible] talking to somebody who knows
   about the possibilities.

2. Talking about my career decisions with an adult who knows
   something about me.

3. Taking courses which will help me decide what line of work
   to go into when I leave school or college.

4. Taking courses which will help me in college, in job training,
   or on the job.

5. Taking part in school or out-of-school activities which will
   help me in college, in training, or on the job.

6. [Beginning of statement illegible] work to go into when I leave school.

7. Getting a part-time or summer job which will help me decide
   what kind of work I might go into.

8. Getting a part-time or summer job which will help me get the
   kind of job or training I want.






Here are the possible answers:

1 - I have not given any thought to this.

2 - I have given some thought to this, but haven't made any plans
    to do this.

3 - I have some plans to do this, but am still not sure of them.

4 - I have made definite plans to do this, but don't know how to
    carry them out.

5 - I have made definite plans to do this, and know what to do to
    carry them out.

6 - I have done this.

Here are some more statements:

9. Getting money for college or training.

10. Dealing with things which might make it hard for me to get
    the kind of training or the kind of work I would like.

11. Getting the kind of training, education, or experience which
    I will need to get into the kind of work I want.

12. Getting a job once I've finished my education and training.

13. Doing the things I need to do to become a valuable employee
    who doesn't have to be afraid of losing his job or being
    laid off when times are hard.

14. Getting ahead (more money, promotions, etc.) in the kind
    of work I choose.

15. How would you rate your plans for "after high school"? (Please
    check (✓) one answer.)

    a. Not at all clear or sure

    b. Not very clear

    c. Some not clear, some clear

    d. Fairly clear

    e. Very clear, all decided






Students differ greatly in the amount of time and thought they give
to making choices. Use the five ratings below to compare yourself
to the typical students of your sex in your grade in each of the
areas of choice listed below. Mark the number of your rating in the
space provided in each statement.

Here are the ratings:

1 - much below average, not as good as most

2 - a little below average

3 - average

4 - a little above average

5 - much above average, better than most

Here are the statements:

16. Compared to my classmates I am ____ in the amount of time and
    thought I give to choosing high school courses.

17. Compared to my classmates I am ____ in the amount of time and
    thought I give to choosing high school activities.

18. Compared to my classmates I am ____ in the amount of time and
    thought I give to choosing out-of-school activities.

19. Compared to my classmates I am ____ in the amount of time and
    thought I give to choosing among general alternatives available
    to me after high school (for example: choosing college or
    business school or technical school or work or military service
    or marriage, etc.)

20. Compared to my classmates I am ____ in the amount of time and
    thought I give to choosing among specific alternatives available
    to me (for example: type of college, branch of the military
    service, characteristics of husband or wife, etc.)

21. Compared to my classmates I am ____ in the amount of time and
    thought I give to choosing an occupation for after high school,
    college or job training.

22. Compared to my classmates I am ____ in the amount of time and
    thought I give to choosing a career in general.







III. How much do you know about the occupation you said you would most
     like to enter on page one of this inventory? Below are five possible
     answers to use in answering statements 23 through 33. Mark the number
     of your answer in the space provided in each statement.

Here are the answers:

1 - hardly anything

2 - a little

3 - an average amount

4 - a good deal

5 - a great deal

Here are the statements:

23. I know ____ about what people really do on the job I said I
    would like to enter.

24. I know ____ about specialities in the occupation I said I would
    like to enter.

25. I know ____ about different places where people might work in
    this occupation.

26. I know ____ about the qualifications and skills needed for this
    occupation.

27. I know ____ about the environmental working conditions in this
    occupation.

28. I know ____ about the education or training needed to get into
    this occupation.

29. I know ____ about the courses offered in high school that are
    the best for this occupation.

30. I know ____ about the need for more people in this occupation.

31. I know ____ about different ways of getting into this occupation.

32. I know ____ about the starting pay in this occupation.

33. I know ____ about the chances for getting raises and promotions.

IV. What sources of information would you go to for help in making your
    job or college plans? Use the five possible answers listed below to
    show whether or not you would go to the sources of information listed
    below. Mark the number of your answer in the space provided in each
    statement.

Here are the answers:

1 - definitely not

2 - probably not

3 - not be sure whether to

4 - probably

5 - definitely

Here are the statements:



34. I would ____ go to my father or male guardian.

35. I would ____ go to my mother or female guardian.

36. I would ____ go to my brothers, sisters, or other relatives.

37. I would ____ go to my friends.

38. I would ____ go to coaches of teams I have been on.

39. I would ____ go to my minister, priest, or rabbi.

40. I would ____ go to teachers.

41. I would ____ go to school counselors.

42. I would ____ go to private counselors, outside of school.

43. I would ____ go to books with the information I need.

44. I would ____ go to audio or visual aids like tape recordings,
    movies or computers.

45. I would ____ go to college catalogues.

46. I would ____ go to persons in the occupation or at the college
    I am considering.

47. I would ____ go to TV shows, movies, or magazines.



Here again are five answers which are to be used with [statement
numbers illegible]. This time use the answers to show which of the
[remainder of instructions illegible].

Here are the answers:

1 - no useful information

2 - very little useful information

3 - some useful information

4 - a good deal of useful information

5 - a great deal of useful information

Here are the statements:

48. I have gotten ____ from my father or male guardian.

49. I have gotten ____ from my mother or female guardian.

50. I have gotten ____ from my brothers, sisters or other relatives.

51. I have gotten ____ from my friends.

52. I have gotten ____ from coaches of teams I have been on.

53. I have gotten ____ from my minister, priest, or rabbi.

54. I have gotten ____ from teachers.

55. I have gotten ____ from school counselors.

56. I have gotten ____ from private counselors, outside of school.

57. I have gotten ____ from books with the information I needed.

58. I have gotten ____ from audio or visual aids like tape
    recordings, movies, or computers.

59. I have gotten ____ from college catalogues.

60. I have gotten ____ from persons in the occupation or at the
    college I am considering.

61. I have gotten ____ from TV shows, movies, or magazines.



Here each question has its own set of possible answers. Check (✓)
only one answer for each question.

62. Which one of the following is the best source of information
    about job duties and opportunities?

    1) The Encyclopaedia Britannica
    2) World Almanac
    3) Scholastic Magazine
    4) The Occupational Index
    5) Occupational Outlook Handbook

63. [Question illegible]

    1) The World Book Encyclopedia
    2) Webster's Collegiate Dictionary
    3) Lovejoy's College Guide
    4) Reader's Digest
    5) The Education Index

64. Which one of the following pairs of occupations involves the
    [remainder of question illegible]

    1) Tailor, Sales Clerk
    2) Engineer, Banker
    3) Tailor, Engineer
    4) Banker, Sales Clerk

65. The occupational fields expected to grow most rapidly during
    the next ten years are:

    1) Professional and service
    2) Sales and crafts
    3) Crafts and clerical
    4) Labor and sales






66. Between 1910 and 1970, the industry employing the greatest
    number of workers changed from:

    1) Agriculture to wholesale and retail trade
    2) Manufacturing to agriculture
    3) Wholesale and retail trade to manufacturing
    4) Agriculture to manufacturing

VII. [Text illegible] occupation [text illegible] education required
     [text illegible] left of each statement.

Type of Education:

1 - High School Graduation

2 - Apprenticeship Training

3 - Technical School or Community College (2 year)

4 - College Degree (4 year)

5 - Professional Degree Beyond College

Occupations:

67. Stenographer

68. Dental Technician

69. Family Doctor (Physician)

70. Mail Carrier

71. Plumber

72. Computer Operator

73. Bank Clerk

74. Social Worker







the space to the left of the occupation.

Type of Equipment:

1 - Manikin

2 - Ammeter

3 - Centrifuge

4 - Trowel

5 - Ledger

Type of Occupations:

75. Electrician

76. Bookkeeper

77. Bricklayer

78. Dressmaker

79. Medical Technician

IX. Here again, each question has its own set of answers. Check (✓)
    only one answer for each question.

80. In the 9th and 10th grades, plans about jobs and occupations
    should:

    1) be clear.
    2) not rule out any possibilities.
    3) keep open the best possibilities.
    4) not be something to think about.






81. Decisions about high school courses can have an effect on:

    1) the kind of diploma one gets.
    2) the kind of training or education one can get after high
       school.
    3) later occupation choices.
    4) how much one likes school.
    5) all of these.

82. Decisions about jobs should take into account:

    1) strengths, or what one is good at learning and doing.
    2) what one likes to do.
    3) the kind of person one is.
    4) the chances for getting ahead in that kind of job.
    5) all of these.

83. One of the things that great artists, musicians, and professional
    athletes have in common is the desire to:

    1) [answer illegible]
    2) have large audiences.
    3) be the best there is at what they do.
    4) teach others what they do.

84. Mary thinks she might like to become a computer programmer,
    but she knows little about computer programming. She is going
    to the library to find out more about it. The most important
    thing for Mary to know now is:

    1) what the work is, what she would do in it.
    2) what the pay is.
    3) what the hours of work are.
    4) where she can get the right training.

Jane likes her high school biology and general science courses
best. She likes to do her schoolwork alone so she can concentrate.
When she begins to think about her future occupation,
she should consider:

    1) Nurse.
    2) Accountant.
    3) Medical Laboratory Technician.
    4) Elementary School Teacher.

Peter is the best speaker on the school debating team. The school
yearbook describes him as "our golden tongued [text illegible]
guy who can listen as well as talk--he could sell [word illegible]
to the Eskimos." Peter will probably graduate [text illegible]
of his class, although his test scores show that he is very bright.
His only good grades (mostly B's) are in business subjects. His
poorest grades are in English and social studies (mostly C's).

Peter's desire to become a trial lawyer is not very realistic
because:

    1) with his grades he will have difficulty getting into a
       four-year liberal arts college.
    2) he has poor grades in the subjects that are most important
       for law.
    3) there is much more to being a lawyer than being good at
       public speaking.
    4) all of the above are good reasons for thinking that Peter
       will have a hard time becoming a trial lawyer.

The facts about Peter suggest that he should think about becoming:

    1) an accountant.
    2) a salesman.
    3) an actor.
    4) a school counselor.
    5) a lawyer.






Ernie took soae tests which show that he sight be good at 
cltrica; work. Ernie s«ys, "I Just can*t see a/self sitting 
behind a dbsk ^or tht rest of my life* Ta tht kind of luy 
who likes vafiety* I. think being s traveling salesman wbtrld 
suit mt tint*** He shcjuld: 

1) disregard the teits and do what he wantG to do* 

2) do what the tests say since they know better than he does 
what he would be good at* 

■»\ 

5) look for A job which will let hin use ^is clerical abilities 
but not keep hin pinned to a desk* 

4) ask to be tested with another test since the results of the 
the first one are probably wVong.' 

Joe is very good with his hands and there isn't anybody in his
class who has more mechanical aptitude. He is also good at
art. His best subject at school is math. Joe likes all of
these things.

What should Joe do? Should he:

1) look for an occupation in which he can use as many of his
interests and abilities as possible?

2) pick an occupation which uses math since there is a better
future in that than in art or in working with his hands?

3) decide which of these activities he is best at, or likes the
most, and then pick an occupation which uses that kind of
activity?

4) put off deciding about his future and wait until he loses
interest in some of these activities?

Betty gets very good science grades but this isn't her favorite
subject. The subject she likes best is art even though her
grades in it are only average. Betty is most likely to do well
in her future occupation if she:

1) forgets about her interest in art since she is so much better
in science.

2) doesn't worry about the fact that she isn't very good at art
because if you like something you can become good at it.

3) looks for an occupation which uses both art and science,
but more science than art.

4) looks for an occupation which involves both science and art,
but more art than science.






Bob says he really doesn't care what kind of work he gets into
once he leaves school as long as it is working with people. If
this is all Bob cares about he is likely to make a bad choice
because:

1) this kind of work usually requires a college degree.

2) employers usually hire girls for such work.

3) people look down on men who work with people because such
work is usually done by girls.

4) occupations in which one works with people can be very
different from each other in the abilities and interests
which are needed.






COOPERSMITH 



SELF-ESTEEM INVENTORY 



RMC Research Corp.
2570 W. El Camino Real 
Mountain View, CA 94040 
415/941-9550 






PRACTICE ITEMS

A.  I like to watch TV.                               LIKE ME    NOT LIKE ME
B.  I'm a good worker.                                LIKE ME    NOT LIKE ME

 1. I spend a lot of time daydreaming.                LIKE ME    NOT LIKE ME   (33)
 2. I'm pretty sure of myself.                        LIKE ME    NOT LIKE ME   (34)
 3. I often wish I were someone else.                 LIKE ME    NOT LIKE ME   (35)
 4. I'm easy to like.                                 LIKE ME    NOT LIKE ME   (36)
 5. My parents and I have a lot of fun together.      LIKE ME    NOT LIKE ME   (37)
 6. I never worry about anything.                     LIKE ME    NOT LIKE ME   (38)
 7. I find it very hard to talk in front of the
    class.                                            LIKE ME    NOT LIKE ME   (39)
 8. I wish I were younger.                            LIKE ME    NOT LIKE ME   (40)
 9. There are lots of things about myself I'd
    change if I could.                                LIKE ME    NOT LIKE ME   (41)
10. I can make up my mind without too much
    trouble.                                          LIKE ME    NOT LIKE ME   (42)
11. I'm a lot of fun to be with.                      LIKE ME    NOT LIKE ME   (43)
12. I get upset easily at home.                       LIKE ME    NOT LIKE ME   (44)
13. I always do the right thing.                      LIKE ME    NOT LIKE ME   (45)
14. I'm proud of my school work.                      LIKE ME    NOT LIKE ME   (46)
15. Someone always has to tell me what to do.         LIKE ME    NOT LIKE ME   (47)
16. It takes me a long time to get used to
    anything new.                                     LIKE ME    NOT LIKE ME   (48)






[Items 17 through 37: item text not recoverable from the source. Each item carries the response columns LIKE ME and NOT LIKE ME.]







38. I have a low opinion of myself.                   LIKE ME    NOT LIKE ME   (33)
39. I don't like to be with other people.             LIKE ME    NOT LIKE ME   (34)
40. There are many times when I'd like to leave
    home.                                             LIKE ME    NOT LIKE ME   (35)
41. I'm never shy.                                    LIKE ME    NOT LIKE ME   (36)
42. I often feel upset in school.                     LIKE ME    NOT LIKE ME   (37)
43. I often feel ashamed of myself.                   LIKE ME    NOT LIKE ME   (38)
44. I'm not as nice looking as most people.           LIKE ME    NOT LIKE ME   (39)
45. If I have something to say, I usually say it.     LIKE ME    NOT LIKE ME   (40)
46. Kids pick on me very often.                       LIKE ME    NOT LIKE ME   (41)
47. My parents understand me.                         LIKE ME    NOT LIKE ME   (42)
48. I always tell the truth.                          LIKE ME    NOT LIKE ME   (43)
49. My teacher makes me feel that I'm not good
    enough.                                           LIKE ME    NOT LIKE ME   (44)
50. I don't care what happens to me.                  LIKE ME    NOT LIKE ME   (45)
51. I'm a failure.                                    LIKE ME    NOT LIKE ME   (46)
52. I get upset easily when I'm scolded.              LIKE ME    NOT LIKE ME   (47)
53. Most people are better liked than I am.           LIKE ME    NOT LIKE ME   (48)
54. I usually feel as if my parents are pushing
    me.                                               LIKE ME    NOT LIKE ME   (49)
55. I always know what to say to people.              LIKE ME    NOT LIKE ME   (50)
56. I often get discouraged in school.                LIKE ME    NOT LIKE ME   (51)
57. Things usually don't bother me.                   LIKE ME    NOT LIKE ME   (52)
58. I can't be depended on.                           LIKE ME    NOT LIKE ME   (53)



This booklet was prepared by RMC Research Corporation, Mountain View, California,
for use under National Institute of Education Contract No. NIE-400-78-0021.






Study of the CAREER INTERN PROGRAM 



INTERNAL-EXTERNAL SCALE



RMC Research Corp.
2570 W. El Camino Real
Mountain View, CA 94040
415/941-9550






INTERNAL-EXTERNAL SCALE

NAME                                        DATE



DIRECTIONS:

The purpose of this short task is to determine how you feel about
certain things.

Read each of the following paired statements. Which of the two
statements do you agree with more? Circle that letter. Choose only
one. (However, be sure to choose one of the paired statements for
each item.)

Example:  1. a. Most children should be punished by their mothers.
             b. A child knows when he does something wrong.



 1. a. Children get into trouble because their parents punish them
       too much.
    b. The trouble with most children nowadays is that their parents
       are too easy with them.

 2. a. Many of the unhappy things in people's lives are partly due to
       bad luck.
    b. People's misfortunes result from the mistakes they make.

 3. a. One of the major reasons why we have wars is because people
       don't take enough interest in politics.
    b. There will always be wars, no matter how hard people try to
       prevent them.

 4. a. In the long run people get the respect they deserve in this
       world.
    b. Unfortunately, an individual's worth often passes unrecognized
       no matter how hard he tries.

 5. a. The idea that teachers are unfair to students is nonsense.
    b. Most students don't realize the extent to which their grades
       are influenced by accidental happenings.

 6. a. Without the right breaks one cannot be an effective leader.
    b. Capable people who fail to become leaders have not taken
       advantage of their opportunities.

 7. a. No matter how hard you try some people just don't like you.
    b. People who can't get others to like them don't understand how
       to get along with others.






 8. a. Heredity plays the major role in determining one's personality.
    b. It is one's experiences in life which determine what they're
       like.

 9. a. I have often found that what is going to happen will happen.
    b. Trusting to fate has never turned out as well for me as making
       a decision to take a definite course of action.

10. a. In the case of the well prepared student there is rarely if
       ever such a thing as an unfair test.
    b. Many times exam questions tend to be so unrelated to course
       work that studying is really useless.

11. a. Becoming a success is a matter of hard work, luck has little
       or nothing to do with it.
    b. Getting a good job depends mainly on being in the right place
       at the right time.

12. a. The average citizen can have an influence in government
       decisions.
    b. This world is run by the few people in power, and there is not
       much the little guy can do about it.

13. a. When I make plans, I am almost certain that I can make them
       work.
    b. It is not always wise to plan too far ahead because many
       things turn out to be a matter of good or bad fortune anyhow.

14. a. There are certain people who are just no good.
    b. There is some good in everybody.

15. a. In my case getting what I want has little or nothing to do with
       luck.
    b. Many times we might just as well decide what to do by flipping
       a coin.

16. a. Who gets to be the boss often depends on who was lucky enough
       to be in the right place first.
    b. Getting people to do the right thing depends upon ability, luck
       has little or nothing to do with it.

17. a. As far as world affairs are concerned, most of us are the
       victims of forces we can neither understand nor control.
    b. By taking an active part in political and social affairs the
       people can control world events.

18. a. Most people don't realize the extent to which their lives are
       controlled by accidental happenings.
    b. There really is no such thing as "luck."






19. a. One should always be willing to admit mistakes.
    b. It is usually best to cover up one's mistakes.

20. a. It is hard to know whether or not a person really likes you.
    b. How many friends you have depends on how nice a person you are.

21. a. In the long run the bad things that happen to us are balanced
       by the good ones.
    b. Most misfortunes are the result of lack of ability, ignorance,
       laziness, or all three.

22. a. With enough effort we can wipe out political corruption.
    b. It is difficult for people to have much control over the things
       politicians do in office.

23. a. Sometimes I can't understand how teachers arrive at the grades
       they give.
    b. There is a direct connection between how hard I study and the
       grades I get.

24. a. A good leader expects people to decide for themselves what they
       should do.
    b. A good leader makes it clear to everybody what their jobs are.

25. a. Many times I feel that I have little influence over the things
       that happen to me.
    b. It is impossible for me to believe that chance or luck plays
       an important role in my life.

26. a. People are lonely because they don't try to be friendly.
    b. There's not much use in trying too hard to please people, if
       they like you, they like you.

27. a. There is too much emphasis on athletics in high school.
    b. Team sports are an excellent way to build character.

28. a. What happens to me is my own doing.
    b. Sometimes I feel that I don't have enough control over the
       direction my life is taking.

29. a. Most of the time I can't understand why politicians behave the
       way they do.
    b. In the long run the people are responsible for bad government
       on a national as well as on a local level.






APPENDIX D 

The Correction for Guessing: 
Valid and Invalid Applications 






The purpose of this appendix is to attempt to clarify issues
concerning application of the so-called correction for guessing,
particularly as that correction was employed in the Gibboney
Associates (1977) evaluation of the Career Intern Program.

Tinkelman (1971) provides an excellent discussion of the
correction for guessing. As he points out, if a test taker
responds in a purely random fashion to k test items each of which
has n choices, the expectation is that he or she will answer k/n
items correctly and k - k/n items incorrectly. If one assumes
that all of the items answered incorrectly, W, were items which
the respondent answered randomly, then W = k - k/n. It follows
that k/n = W/(n - 1). Since the total number of items answered
correctly, R, is made up of the items to which the respondent knew
the answer plus those which he or she got right by random guessing
(k/n), the number of items to which the respondent knew the answer
is given by R minus the correction for guessing W/(n - 1).
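
The formula can be written out as a short Python sketch. The function and argument names (corrected_score, right, wrong, n_choices) are illustrative assumptions and do not come from Tinkelman or from the Gibboney report; the arithmetic is simply R - W/(n - 1) as derived above.

```python
def corrected_score(right: float, wrong: float, n_choices: int) -> float:
    """Return R - W/(n - 1), the score corrected for guessing.

    right     -- R, number of items answered correctly
    wrong     -- W, number of items answered incorrectly (omitted items excluded)
    n_choices -- n, number of answer choices per item (assumed equal for all items)
    """
    if n_choices < 2:
        raise ValueError("each item needs at least two answer choices")
    return right - wrong / (n_choices - 1)


# A respondent who guesses at random on k = 40 four-choice items is expected
# to get k/n = 10 right and k - k/n = 30 wrong, so the corrected score is 0.
print(corrected_score(right=10, wrong=30, n_choices=4))  # 0.0
```

Note that omitted items enter neither R nor W, so the correction only removes credit for items that were actually attempted.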

What is important to note about the correction for guessing
is that it is mathematically correct only when respondents answer
correctly all items to which they know the answers and perform in
a random fashion on all other items they attempt. As Tinkelman
correctly points out, when guessing is non-random, the formula
breaks down. It does not "work," for example, if the respondent
is able to eliminate one or two of the answer choices as definitely
incorrect and guesses among the remaining choices, or if he
or she falls into a trap rigged by the ingenious item writer. It
also does not work, as will be illustrated below, if the respondent
guesses randomly on items where he or she knows the answer.
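
One of these breakdowns, guessing after eliminating some choices, can be illustrated with expected values. The sketch below is hypothetical: the respondent, the figures (12 known items out of 40 four-choice items), and the function name are assumptions chosen for illustration, and every unknown item is assumed to allow the same number of eliminations.

```python
def expected_corrected_score(known, unknown, n_choices, eliminated):
    """Expected corrected score when `eliminated` wrong choices are ruled out
    on every unknown item before guessing among the remaining choices."""
    p_right = 1 / (n_choices - eliminated)      # chance of a lucky guess
    right = known + unknown * p_right           # expected R
    wrong = unknown * (1 - p_right)             # expected W
    return right - wrong / (n_choices - 1)      # R - W/(n - 1)


# Hypothetical respondent: knows 12 of 40 four-choice items.
pure_random = expected_corrected_score(12, 28, 4, eliminated=0)
informed = expected_corrected_score(12, 28, 4, eliminated=2)
print(pure_random)         # 12.0
print(round(informed, 1))  # 21.3
```

Under these assumptions the correction recovers the 12 known items only when the guessing is purely random; with two choices eliminated on each unknown item it credits the respondent with about 21 items.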

In the Gibboney study, the correction for guessing was
applied to the "raw" test scores because "many of the people in
the control group were completing the items by pattern responses
on the answer sheet rather than by solving the problems and
choosing their answer from among the distractors" (Vol. II, p.
16). As additional evidence that random responding occurred, the
report indicated that (a) the increase in number of reading test
items attempted from pre- to posttest was greater for the control
than for the CIP group; (b) although control group members attempted
an average of 13.7 more items on the reading posttest than
on the pretest, the number of items answered correctly increased
by only .5; and (c) on the math test, the percentage of attempted
items answered correctly increased from pre- to posttest for the
CIP group but decreased for the controls.

These facts all suggest that members of the control group
did, in fact, exhibit more random behavior (guessing) than members
of the treatment group in responding to the posttest instruments.
The critical question, as will be seen later, is whether they
guessed only on items for which they could not have worked out the
correct answers or whether they also guessed on items they could
have answered correctly. Under the former condition, the correction
for guessing will serve its intended function whereas, under
the latter condition, it will not. In fact, where guessing has
occurred on items that could have been answered correctly, the
correction for guessing will distort rather than correct. It will
spuriously inflate differences between the guessing and the
non-guessing groups.

Consider Ms. Ceebar, who knows the answer to 12 items on a
40-item test but has no idea what the correct answer may be to any
of the remaining 28 items. If she responds only to those items
about which she is knowledgeable, her score, 12, correctly informs
us of the number of items to which she knows the answer. If she
had answered the items correctly about which she was knowledgeable
and had guessed on the rest, we would expect her to have answered
12 + 28/4, or 19, items correctly. Without a correction for
guessing, we might mistakenly have assumed that she knew the
answers to 19 items. If we apply the correction for guessing,
however, we learn that, even though she answered 19 items correctly,
she only knew the correct answers to 12 of them (19 - 21/3 =
12).

Now suppose that Ms. Ceebar was in a hurry and knew that she
had nothing to gain from putting forth her best effort on the
test. Rather than taking time to read and think about the items,
she decided to save time and effort by simply marking her answer
sheet at random. Under these circumstances she would (if she were
average) have answered 10 items correctly and 30 items incorrectly.
We might mistakenly have assumed, from this information,
that she knew the answer to 10 items (actually she knew the
answers to 12 items). If we apply the correction for guessing
under these circumstances, Ms. Ceebar's corrected score is
10 - 30/3 or 0. Accepting this "corrected" score as a true
indication of the number of items to which she knew the answers
would lead us to a far more erroneous impression of her achievement
level than acceptance of her uncorrected (but still definitely
incorrect) score.
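
The two Ms. Ceebar cases can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the text: the constant and function names are assumptions, and the scores are the expected ("average") values used in the example rather than simulated responses.

```python
N_ITEMS, N_CHOICES, KNOWN = 40, 4, 12   # figures from the Ms. Ceebar example


def correct_for_guessing(right, wrong, n_choices=N_CHOICES):
    """Corrected score R - W/(n - 1)."""
    return right - wrong / (n_choices - 1)


# Case 1: answers the 12 known items, guesses at random on the other 28.
right_1 = KNOWN + (N_ITEMS - KNOWN) / N_CHOICES   # 12 + 28/4 = 19
wrong_1 = N_ITEMS - right_1                       # 21
# Case 2: marks all 40 items at random, ignoring what she knows.
right_2 = N_ITEMS / N_CHOICES                     # 10
wrong_2 = N_ITEMS - right_2                       # 30

for label, r, w in [("case 1", right_1, wrong_1), ("case 2", right_2, wrong_2)]:
    c = correct_for_guessing(r, w)
    print(label, "raw:", r, "corrected:", c, "error vs. 12 known:", c - KNOWN)
```

In case 1 the corrected score equals the 12 items actually known; in case 2 it is 0, which misses the true figure by 12 items while the uncorrected score of 10 misses it by only 2.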

Assume that Ms. Ceebar was the average member of the control
group and that she responded to the posttest in the manner just
described. Mr. Teabar, who was the average member of the treatment
group, also knew the answers to 12 items on the test. He
answered these items correctly and guessed on the remaining 28
items. His score was 12 + 28/4 or 19. Ms. Ceebar's score was
ten. Since the treatment effect is measured by subtracting the
posttest score of the control group's average member from the
posttest score of the treatment group's average member, we would
conclude (erroneously) that the treatment had an impact of nine
units (19 - 10 = 9).







Suppose we now correct both scores for guessing. Ms. Ceebar's
score becomes 0 and Mr. Teabar's score becomes 12. Since
12 - 0 = 12, we now conclude that the effect of the treatment was
a 12-point gain. This gain estimate is 33% larger than the gain
estimate derived from scores that were not corrected for guessing.
Both estimates are infinitely larger than the true gain that is
obtained by subtracting the number of items to which Ms. Ceebar
knew the answers (12) from the number of items to which Mr. Teabar
knew the answers (also 12), 12 - 12 = 0.
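
The three gain estimates in this example can be checked with the following Python sketch. The variable names are illustrative assumptions; the scores are the expected values given in the text for a 40-item, four-choice test.

```python
# Expected posttest results for the two hypothetical average group members
# (40 four-choice items; both know the answers to 12 items).
teabar_right, teabar_wrong = 19, 21    # answers 12 known items, guesses on 28
ceebar_right, ceebar_wrong = 10, 30    # marks all 40 items at random


def correct_for_guessing(right, wrong, n_choices=4):
    """Corrected score R - W/(n - 1)."""
    return right - wrong / (n_choices - 1)


uncorrected_gain = teabar_right - ceebar_right                        # 19 - 10
corrected_gain = (correct_for_guessing(teabar_right, teabar_wrong)
                  - correct_for_guessing(ceebar_right, ceebar_wrong))  # 12 - 0
true_gain = 12 - 12                                                   # items known

print(uncorrected_gain, corrected_gain, true_gain)           # 9 12.0 0
print(round(100 * (corrected_gain / uncorrected_gain - 1)))  # 33
```

The corrected estimate (12) exceeds the uncorrected estimate (9) by about 33%, and both overstate a true difference of zero.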

The mathematics of the preceding argument are clear, but the
argument itself may not apply exactly to the Gibboney Associates
evaluation. Nevertheless, if there was even one more guess in the
control group on an item the respondent could have answered
correctly (by applying more time or effort) than there was in the
treatment group, some distortion was introduced by the "correction"
for guessing.

As pointed out earlier, the Gibboney Report provides ample
and very convincing evidence that there was more random responding
on the posttest among control group members than among treatment
group members. The data show, in fact, that control group members,
who correctly answered 71% of the items they attempted on
the pretest, answered only 55% correctly on the posttest. The
corresponding figures for the treatment group were 68% and 66%,
respectively.

The Gibboney data also strongly suggest that some of the
random responding occurred on items that the respondents could
have answered correctly if they had made the effort. This inference
is based on the fact that the control group members
responded to 13.7 more items on the reading posttest than they did
on the pretest. By chance alone they should have gotten 3.4 of
these items correct. Their posttest scores, however, increased by
only .5 points over their pretest scores, indicating that they
must have answered 2.9 items incorrectly on the posttest that they
had answered correctly on the pretest. It seems most unlikely
that this phenomenon could be the result of a real loss of reading
ability, considering the age of the students and the length of the
pre-to-posttest interval. Thus, while the possible existence of a
real loss of reading ability must be acknowledged, the probability
that control group members responded randomly to some items that
they could, with more effort, have answered correctly seems
overwhelmingly greater.
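
The arithmetic behind this inference is sketched below in Python. The 13.7 and .5 figures are those quoted from the Gibboney report; the assumption of four answer choices per reading item is ours, adopted because it reproduces the 3.4 items expected by chance, and the variable names are illustrative.

```python
extra_items_attempted = 13.7   # additional reading items attempted on the posttest
observed_score_change = 0.5    # reported increase in items answered correctly
n_choices = 4                  # assumption: four answer choices per item

# Items expected correct by chance alone among the extra items attempted.
expected_by_chance = extra_items_attempted / n_choices
# Shortfall: items answered correctly on the pretest but missed on the posttest.
implied_loss = expected_by_chance - observed_score_change

print(round(expected_by_chance, 1))  # 3.4
print(round(implied_loss, 1))        # 2.9
```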

The situation appears to be almost identical to the hypothetical
example presented above involving Ms. Ceebar and Mr.
Teabar. Random responding in the control group produced an
uncorrected (for guessing) estimate of gain that was spuriously
high. Applying the correction for guessing, rather than correcting
this problem, actually exacerbated it by making the already
too-large estimate even larger.

The authors feel that the preceding discussion has made a
convincing case against correcting scores for guessing under
circumstances such as were observed in the Gibboney evaluation.






REFERENCES 



Campbell, D. T., & Boruch, R. F. Making the case for randomized
assignment to treatments by considering the alternatives: Six
ways in which quasi-experimental evaluations in compensatory
education tend to underestimate effects. In C. A. Bennett &
A. A. Lumsdaine (Eds.), Evaluation and experiments: Some
critical issues in assessing social programs. New York:
Academic Press, 1975.

Campbell, D. T., & Erlebacher, A. E. How regression artifacts in
quasi-experimental evaluations can mistakenly make compensatory
education look harmful. In J. Hellmuth (Ed.), Disadvantaged
child. Vol. 3. Compensatory education: A national debate.
New York: Brunner/Mazel, 1970.

Cook, T. D., & Campbell, D. T. Quasi-experimentation: Design
and analysis issues for field settings. Chicago: Rand
McNally, 1979.

Coopersmith, S. The antecedents of self-esteem. San Francisco:
W. H. Freeman & Co., 1967.

Cronbach, L. J., Ambron, S. R., Dornbusch, S. M., Hess, R. D.,
Hornik, R. C., Phillips, D. C., Walker, D. F., & Weiner, S. S.
Toward reform of program evaluation. San Francisco, CA:
Jossey-Bass Inc., 1980.

Fetterman, D. M. Study of the Career Intern Program. Final
technical report, Task C: Program dynamics: Structure,
function, and interrelationships. Mountain View, CA: RMC
Research Corporation, 1981. [UR 465]

Gibboney Associates, Inc. The Career Intern Program: Final
report. Volumes I and II. Blue Bell, PA: Author, 1977.
(NIE Papers in Education and Work)

Kenny, D. A. A quasi-experimental approach to assessing treatment
effects in the nonequivalent control group design. Psychological
Bulletin, 1975, 82(3), 345-362.

National Institute of Education. Compensatory education study.
Final Report. Washington, D.C.: Author, 1978.

National Institute of Education. Request for proposal: Study of
the Career Intern Program. Washington, D.C.: Author, 1978.
[RFP NIE-R-78-0004]

Porter, A. C. The effects of using fallible variables in the
analysis of covariance. Unpublished doctoral dissertation,
University of Wisconsin, 1967.

Raven, J. C. Matrix tests. Mental Health, 1940, 1, 10-18.

Rotter, J. B. Generalized expectancies for internal versus
external control of reinforcement. Psychological Monographs,
1966, 80, No. 609.

Saretsky, G. The OEO P.C. experiment and the John Henry effect.
Phi Delta Kappan, 1972, 579-581.

Super, D. E. Career development. In J. R. Davitz & S. Ball
(Eds.), Psychology of the educational process. New York:
McGraw-Hill, 1970.

Tallmadge, G. K. Cautions to evaluators. In M. J. Wargo & D. R.
Green (Eds.), Achievement testing of disadvantaged and minority
students for educational program evaluation. Based on the
proceedings of a U.S. Office of Education Invitational Conference.
CTB/McGraw-Hill, 1978, 331-384.

Tallmadge, G. K. An empirical assessment of norm-referenced
evaluation methodology. Mountain View, CA: RMC Research Corporation,
1981.

Tallmadge, G. K., & Horst, D. P. A procedural guide for validating
achievement gains in educational projects. Washington,
D.C.: U.S. Government Printing Office, 1976. (Stock No.
017-080-01516)

Tallmadge, G. K., & Wood, C. T. User's Guide: ESEA Title I
evaluation and reporting system. Mountain View, CA: RMC
Research Corporation, 1976.

Tallmadge, G. K., & Yuen, S. D. Study of the Career Intern Program.
Interim technical report No. 2, Task B: Assessment of intern
outcomes. Mountain View, CA: RMC Research Corporation, 1980.
[UR 462]

Thomas, T. C., & Pelavin, S. H. Patterns in ESEA Title I reading
achievement. Menlo Park, CA: SRI International, 1976.

Tinkelman, S. N. Test design, construction, administration, and
processing. In R. L. Thorndike (Ed.), Educational Measurement
(2nd ed.). Washington, D.C.: American Council on Education,
1971.

Treadway, P. G., Stromquist, N. P., Fetterman, D. M., Foat, C. M.,
& Tallmadge, G. K. Study of the Career Intern Program. Final
report, Task A: Implementation. Mountain View, CA: RMC Research
Corporation, 1981. [UR 464]

Winer, B. J. Statistical principles in experimental design
(2nd ed.). New York: McGraw-Hill, 1971.




